5 docs tagged with "OpenAI"

CLIP: Connecting text and images

We’re introducing a neural network called CLIP which efficiently learns visual concepts from natural language supervision. CLIP can be applied to any visual classification benchmark by simply providing the names of the visual categories to be recognized, similar to the “zero-shot” capabilities of GPT-2 and GPT-3. Although deep learning has revolutionized computer vision, current approaches have several major problems: typical vision datasets are labor-intensive and costly to create while teaching only a narrow set of visual concepts; standard vision models are good at one task and one task only, and require significant effort to adapt to a new task; and models that perform well on benchmarks have disappointingly poor performance on stress tests, casting doubt on the entire deep learning approach to computer vision.
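The zero-shot idea described above, classifying an image by supplying only the category names, can be sketched as follows. This is a minimal illustration, not CLIP's actual implementation. It assumes an image encoder and a text encoder have already produced embeddings in a shared space. The toy vectors, the prompt strings, the helper names (`cosine_similarity`, `softmax`, `zero_shot_classify`), and the `temperature` value are all hypothetical and chosen only for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, label_embeddings, temperature=100.0):
    """Score an image embedding against one text embedding per category name.

    `temperature` scales the similarities before the softmax; the value
    here is an illustrative assumption, not a documented setting.
    """
    labels = list(label_embeddings)
    logits = [
        temperature * cosine_similarity(image_embedding, label_embeddings[label])
        for label in labels
    ]
    return dict(zip(labels, softmax(logits)))


# Toy embeddings standing in for real encoder outputs (hypothetical values).
label_embeddings = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
    "a photo of a car": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

probs = zero_shot_classify(image_embedding, label_embeddings)
best = max(probs, key=probs.get)
print(best)
```

Adding a new category in this sketch only requires another entry in `label_embeddings`; no retraining step is involved, which is the property the summary calls "zero-shot".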

Video generation models as world simulators

This report describes a method for unifying the representation of visual data so that generative models can be trained at large scale, and it evaluates the capabilities and limitations of a model named Sora. Unlike previous work, which often focuses on specific types of visual data, Sora is a generalist model capable of generating videos and images of various durations, aspect ratios, and resolutions.

What OpenAI Really Wants

OpenAI, co-founded by Sam Altman, is on a mission to build artificial general intelligence (AGI) that is safe for humanity. The company's history has been marked by significant breakthroughs, challenges, and transformations.