3 docs tagged with "machine learning"

CLIP: Connecting text and images

We’re introducing a neural network called CLIP which efficiently learns visual concepts from natural language supervision. CLIP can be applied to any visual classification benchmark by simply providing the names of the visual categories to be recognized, similar to the “zero-shot” capabilities of GPT-2 and GPT-3.

Although deep learning has revolutionized computer vision, current approaches have several major problems: typical vision datasets are labor-intensive and costly to create while teaching only a narrow set of visual concepts; standard vision models are good at one task and one task only, and require significant effort to adapt to a new task; and models that perform well on benchmarks have disappointingly poor performance on stress tests, casting doubt on the entire deep learning approach to computer vision.
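The zero-shot setup described above amounts to scoring an image against a short list of candidate label prompts. As a rough illustration (not taken from the post itself), here is a minimal sketch using the open-source openai/CLIP package; the image path and label names are placeholders:

```python
# Minimal zero-shot classification sketch with the openai/CLIP package
# (pip install git+https://github.com/openai/CLIP.git).
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# "Provide the names of the visual categories to be recognized."
labels = ["a photo of a dog", "a photo of a cat", "a photo of a bird"]
text = clip.tokenize(labels).to(device)
image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)

with torch.no_grad():
    # logits_per_image holds image-to-text similarity scores
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print(dict(zip(labels, probs[0])))
```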

LLM Powered Autonomous Agents

Building agents with an LLM (large language model) as the core controller is a cool concept. Several proof-of-concept demos, such as AutoGPT, GPT-Engineer, and BabyAGI, serve as inspiring examples. The potential of LLMs extends beyond generating well-written copy, stories, essays, and programs; an LLM can be framed as a powerful general problem solver.
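As a rough illustration of the "LLM as core controller" idea, here is a minimal agent-loop sketch in the spirit of AutoGPT/BabyAGI-style demos; call_llm and the tool table are hypothetical stand-ins, not any specific framework's API:

```python
# Toy agent loop: the LLM repeatedly decides which tool to call next,
# observes the result, and stops when it chooses "finish".
from typing import Callable, Dict

def call_llm(prompt: str) -> str:
    """Placeholder for a call to any large language model API."""
    raise NotImplementedError

TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"(search results for: {query})",
    "finish": lambda answer: answer,
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        # The LLM acts as the controller: it picks the next tool and its input.
        decision = call_llm(
            f"Goal: {goal}\nHistory so far: {history}\n"
            "Reply as '<tool>: <input>' using one of: search, finish."
        )
        tool, _, argument = decision.partition(":")
        observation = TOOLS.get(tool.strip(), TOOLS["search"])(argument.strip())
        history.append((decision, observation))
        if tool.strip() == "finish":
            return observation
    return "stopped after max_steps"
```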

What's next for AI agentic workflows ft. Andrew Ng of AI Fund - YouTube

Andrew Ng, a renowned computer science professor at Stanford and co-founder of Coursera and Google Brain, discusses the exciting trend of AI agents. Traditional AI models work in a non-agentic workflow, where a single prompt produces a final answer. This is akin to typing an essay without using backspace. In contrast, an agentic workflow involves iterative processes like outlining, researching, drafting, and revising, leading to remarkably better results.
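As a toy illustration of that iterative workflow (not from the talk itself), the sketch below outlines, drafts, and then repeatedly critiques and revises rather than emitting a one-shot answer; call_llm is again a hypothetical stand-in for an actual model call:

```python
# Iterative "agentic" essay workflow: outline -> draft -> critique/revise loop.
def call_llm(prompt: str) -> str:
    raise NotImplementedError  # plug in a real model call here

def agentic_essay(topic: str, rounds: int = 3) -> str:
    outline = call_llm(f"Write an outline for an essay on: {topic}")
    draft = call_llm(f"Write a first draft following this outline:\n{outline}")
    for _ in range(rounds):
        # Each round mirrors the revise step: critique the draft, then rewrite it.
        critique = call_llm(f"Critique this draft and list concrete fixes:\n{draft}")
        draft = call_llm(
            f"Revise the draft to address this critique:\n{critique}\n\nDraft:\n{draft}"
        )
    return draft
```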