AI-powered flight booking? A new model makes it possible
Marie Donlon | January 29, 2025
Artificial intelligence (AI) specialists and programmers at China’s Tsinghua University have developed a graphical user interface (GUI) agent model dubbed UI-TARS, which joins the ranks of emerging large language models like ChatGPT.
Built in conjunction with ByteDance, TikTok’s parent company, UI-TARS can run locally on a personal computer or via the cloud on other devices to perform mundane tasks. One task the agent promises to automate is finding the cheapest airfare between two cities and then purchasing the tickets, a job that requires time-consuming web browsing when done manually.
Source: arXiv (2025). DOI: 10.48550/arxiv.2501.12326
To enable this, the model was trained on 50 billion tokens representing GUI characteristics captured in screenshots, such as those of traditional web pages. The team also relied on reflection tuning, a process in which the model is trained to learn from its mistakes and to adapt its approach to different or unfamiliar situations.
The Tsinghua team explained that a user running UI-TARS sees two tabs. The first shows the app’s "thinking process" as it performs its task; the second displays the websites, files or other GUIs that the app is working with. When the app is booking a flight, for example, the user can see the airline websites being viewed and then switch tabs to observe what the app is doing with them.
The app then presents the final web page and prompts the user to confirm the ticket purchase. During trials, the model outperformed other AI models, including GPT-4o and Gemini-2.0.
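The booking workflow described above, with one running log of the agent's reasoning, one record of the pages it visits and a final confirmation by the user, can be sketched in a few lines of Python. This is an illustrative sketch only and is not UI-TARS code: the names `Fare`, `AgentRun` and `book_cheapest` are hypothetical, the fares are passed in as plain data rather than read from real screenshots, and the `confirm` callback simply stands in for the user clicking "confirm" on the final page.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Fare:
    """One fare offer (hypothetical structure for this sketch)."""
    airline: str
    price: float


@dataclass
class AgentRun:
    """Record of one run, mirroring the two tabs described in the article."""
    thoughts: List[str] = field(default_factory=list)  # "thinking process" tab
    pages: List[str] = field(default_factory=list)     # GUIs the agent worked with


def book_cheapest(
    fares_by_site: Dict[str, List[Fare]],
    confirm: Callable[[Fare], bool],
) -> Tuple[AgentRun, Optional[Fare]]:
    """Pick the cheapest fare across sites, then ask the user before buying."""
    run = AgentRun()
    best: Optional[Fare] = None

    # Visit each site in turn, logging what was seen and tracking the lowest price.
    for site, fares in fares_by_site.items():
        run.pages.append(site)
        run.thoughts.append(f"Opened {site}; found {len(fares)} fares")
        for fare in fares:
            if best is None or fare.price < best.price:
                best = fare

    if best is None:
        run.thoughts.append("No fares found; nothing to purchase")
        return run, None

    run.thoughts.append(f"Cheapest fare: {best.airline} at {best.price:.2f}")
    run.pages.append("checkout")

    # The purchase goes ahead only if the user explicitly confirms it.
    if confirm(best):
        run.thoughts.append("User confirmed; purchase submitted")
        return run, best

    run.thoughts.append("User declined; purchase cancelled")
    return run, None
```

Calling `book_cheapest` with a dictionary of sample fares and a `confirm` function that returns `False` leaves the purchase unmade while still returning the full log, which reflects the point of the design: the agent does the browsing, but the user keeps the final say.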
The model is detailed in the paper, “UI-TARS: Pioneering Automated GUI Interaction with Native Agents,” which is posted on the preprint server arXiv.