NEWSAR
Multi-perspective news intelligence

Vision Language Action

Organization

In robot learning, a vision-language-action (VLA) model is a multimodal foundation model that integrates vision, language, and action. Given an input image of the robot's surroundings and a text instruction, a VLA directly outputs low-level robot actions that can be executed to accomplish the requested task.
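The input/output contract described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical and exists only for illustration: the `Action` fields, the `ToyVLAPolicy` class, and the `control_loop` helper are assumptions, not part of any real VLA library, and the hand-written rule inside `predict` merely stands in for the learned neural network a real VLA would use.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = Sequence[Sequence[float]]  # assumed: grayscale image as rows of pixel values


@dataclass
class Action:
    """Hypothetical low-level command: end-effector displacement plus gripper state."""
    dx: float
    dy: float
    dz: float
    gripper_closed: bool


class ToyVLAPolicy:
    """Stand-in for a learned model; a real VLA would run a neural network here."""

    def predict(self, image: Image, instruction: str) -> Action:
        # Toy "vision" feature: mean pixel brightness (0.0 for an empty image).
        pixels = [p for row in image for p in row]
        brightness = sum(pixels) / len(pixels) if pixels else 0.0
        # Toy "language" feature: substring check for a grasp-like verb.
        text = instruction.lower()
        wants_grasp = any(verb in text for verb in ("pick", "grasp", "grab"))
        # Map both modalities directly to one low-level action.
        return Action(dx=0.0, dy=0.0, dz=-0.01 * brightness, gripper_closed=wants_grasp)


def control_loop(
    policy: ToyVLAPolicy,
    get_image: Callable[[], Image],
    instruction: str,
    steps: int,
) -> List[Action]:
    """Query the policy once per step with a fresh observation and the same instruction."""
    return [policy.predict(get_image(), instruction) for _ in range(steps)]


if __name__ == "__main__":
    camera = lambda: [[1.0, 1.0], [1.0, 1.0]]  # fake camera returning a constant frame
    for action in control_loop(ToyVLAPolicy(), camera, "Pick up the cup", steps=3):
        print(action)
```

The point of the sketch is the shape of the interface: one image and one instruction go in, and one executable action comes out at each control step, with no separate planning or perception stage in between.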
