Prompt Sapper: LLM-Backed Production Tool That Creates AI Chains

https://medium.com/@cbarkinozer/prompt-sapper-yapay-zeka-zincirleri-olu%C5%9Fturan-llm-destekli-%C3%BCretim-arac%C4%B1-388f9804ec75

A summary of the paper “Prompt Sapper: A LLM-Empowered Production Tool for Building AI Chains”.

[https://www.aichain.online/public/content%20pages/sappervsothers.html]

Abstract

The emergence of foundation models such as GPT-4 and DALL-E has opened up a multitude of possibilities in various fields. People can now use natural language (i.e. prompts) to communicate with AI to perform tasks. While humans can use foundation models through chatbots (e.g. ChatGPT), chat is not a production tool for creating reusable AI services, regardless of the capabilities of the foundation models. APIs like LangChain allow LLM-based application development, but they require significant programming knowledge and thus present a barrier. To alleviate this, we propose the concept of an AI chain and incorporate the best principles and practices accumulated in software engineering over decades into AI chain engineering to systematize the AI chain engineering methodology.

We also develop Prompt Sapper, a code-free integrated development environment that inherently incorporates these AI chain engineering principles and models in the process of building AI chains, thereby improving the performance and quality of AI chains. With Prompt Sapper, AI chain engineers can build prompt-based AI services on top of foundation models through conversational requirements analysis and visual programming. Our user study also evaluated and demonstrated the efficiency and accuracy of Prompt Sapper.

Summary

The paper discusses the emergence of large language models (LLMs) and their potential for building AI services.
It highlights the limitations of current chatbot-based interfaces and proposes the concept of AI chains to address them. The authors present Prompt Sapper, a code-free integrated development environment that incorporates AI chain engineering principles and models.
A user study was conducted to evaluate the efficiency and accuracy of Prompt Sapper. The paper aims to systematize the AI chain engineering methodology and improve the performance and quality of AI chains.
Prompt Sapper is a block-based visual programming tool that aims to simplify building AI chain services on top of large language models (LLMs). The tool removes the need to learn various API calls and search for online resources, which makes it more accessible to non-technical users.
The paper highlights the importance of prompt engineering in leveraging the capabilities of LLMs and suggests different strategies to improve prompt performance. It reviews current attempts at task decomposition and prompt chaining but points out their limitations and inflexibility.
The aim of the research is to systematize the AI chain methodology and develop a user-friendly IDE that combines the methodology with LLM co-pilots. The tool supports collaboration between different workers and models and allows users to download and distribute the AI services they create as plug-ins.
The paper contributes to the democratization of foundation models and provides a comprehensive overview of related work and best practices. It surveys existing work and tools for automating production processes, prototyping AI services, and developing AI chains.
It highlights the limitations of existing tools and proposes a comprehensive AI chain methodology and an IDE that supports the development of AI services based on foundation models. The methodology includes activities such as system design, implementation and testing of AI chains.
The concept of prompt programming, which involves designing prompts that instruct workers how to complete tasks, is introduced. The paper emphasizes the importance of aligning task needs with model capability and of iteratively designing and refining AI chains and prompts.
The use of workflow models, computational thinking principles, and software engineering practices when designing workers and prompts is mentioned. The paper provides a framework for rapid prototyping and development of AI services using AI chains.
The paper discusses how the capabilities of large language models can support AI chain engineering. It focuses on two aspects: requirements elicitation and mechanical sympathy.
Sapper IDE is equipped with dedicated LLM-based co-pilots that support requirements elicitation and analyze task requirements. The authors emphasize the importance of understanding emerging AI behaviours and capabilities, as well as testing different prompts to deliver the desired outcomes.
The iterative nature of AI chain design and construction, the principles of computational thinking, and the use of modular design are emphasized. The paper proposes leveraging LLM-based requirements co-pilots to clarify and refine AI chain requirements.
The paper compares how humans and foundation models approach problem solving and emphasizes decomposing complex problems into smaller subtasks. Workflow review and collaboration between AI chain workers are discussed, including the need for algorithmic thinking and control structures.
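The decomposition idea can be illustrated with a minimal sketch. This is not Prompt Sapper's implementation: the `Worker` class, the `run_chain` function, the prompt templates, and the offline `echo_engine` are all hypothetical names chosen for illustration. Each worker wraps one prompt for one subtask, and an ordinary `if` plays the role of a control structure between workers.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical engine type: any function that maps a prompt string to a reply.
# A real chain would call a foundation model here.
Engine = Callable[[str], str]


def echo_engine(prompt: str) -> str:
    """Toy engine (assumption): upper-cases the prompt so the sketch runs offline."""
    return prompt.upper()


@dataclass
class Worker:
    """One unit of an AI chain: a prompt template bound to an engine."""
    name: str
    template: str  # prompt template with {placeholders}
    engine: Engine

    def run(self, **inputs: str) -> str:
        # Fill the template with the inputs, then hand the prompt to the engine.
        return self.engine(self.template.format(**inputs))


def run_chain(text: str, engine: Engine = echo_engine) -> Dict[str, str]:
    """Two-step chain: the first worker's output feeds the second worker."""
    outline = Worker("outline", "List key points of: {text}", engine)
    draft = Worker("draft", "Write a summary from: {points}", engine)

    points = outline.run(text=text)
    # Control structure: only call the second worker if the first returned something.
    summary = draft.run(points=points) if points.strip() else ""
    return {"points": points, "summary": summary}
```

Swapping `echo_engine` for a function that calls a real model would leave the chain structure unchanged, which is the point of splitting a task into small workers.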
The paper stresses the importance of defining function signatures for workers and of adhering to Grice's maxims of conversation when writing prompts. AI chain testing, unit tests and the need for debugging are also briefly touched upon.
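As a rough sketch of what a worker signature and a unit test could look like in plain Python (the function `extract_keywords`, its prompt wording, and the stubbed engines are assumptions made for this example, not code from the paper):

```python
import unittest
from typing import Callable, List


def extract_keywords(text: str, engine: Callable[[str], str]) -> List[str]:
    """Hypothetical worker signature: one text in, a list of keywords out."""
    prompt = (
        "Return comma-separated keywords for the text below.\n"
        f"Text: {text}\n"
        "Keywords:"
    )
    reply = engine(prompt)
    # Normalise the model reply: split on commas, trim, lower-case, drop empties.
    return [k.strip().lower() for k in reply.split(",") if k.strip()]


class ExtractKeywordsTest(unittest.TestCase):
    """Unit tests use a stub engine so no model call is needed."""

    def test_parses_reply(self):
        stub = lambda prompt: " AI chain, Prompt ,, IDE "
        self.assertEqual(
            extract_keywords("anything", stub), ["ai chain", "prompt", "ide"]
        )

    def test_empty_reply(self):
        self.assertEqual(extract_keywords("anything", lambda prompt: ""), [])
```

Saved as a module, the tests can be run with `python -m unittest`. Fixing the input and output types up front makes each worker testable in isolation before it is wired into a chain.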
The paper describes an AI chain IDE (Integrated Development Environment) that supports the development and testing of AI chains. The IDE puts people first and aims to help non-technical professionals complete AI chain tasks.
It supports a no-code development approach, allowing users to explore and design AI chains without programming skills. The IDE includes a Design view that helps elicit requirements and create an AI chain skeleton through the use of chatbots.
Users can modify the generated prompts and steps, and the IDE automatically creates a block-based AI chain for execution and debugging in the Block view. The IDE uses block-style visual programming and is based on the open-source Blockly project.
The paper describes the features and functions of the visual programming tool for creating AI chains. It explains the different types of blocks available, such as preworkers, containers, and code blocks, and how they can be used to build complex AI chains.
Organizing and managing blocks, running and debugging AI chains, and managing software artefacts are also covered. Among the features of the Sapper IDE, the paper focuses on the Prompt Hub and Engine Management functions.
The Prompt Hub allows users to create and edit prompts in a structured manner, while the Engine Management feature allows different types of engines to be used to complete tasks.
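One way to picture these two functions is as a pair of registries: one for structured prompts and one for engines. The sketch below is only an analogy under stated assumptions. The classes `PromptEntry` and `Registry`, the split of a prompt into `context` and `instruction`, and the engine name `passthrough` are hypothetical and are not taken from Sapper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class PromptEntry:
    """A structured prompt (assumed layout): fixed context plus an instruction."""
    context: str
    instruction: str

    def render(self, **values: str) -> str:
        # Join the parts, then fill in any {placeholders}.
        return f"{self.context}\n{self.instruction}".format(**values)


@dataclass
class Registry:
    """Keeps named prompts and named engines side by side."""
    prompts: Dict[str, PromptEntry] = field(default_factory=dict)
    engines: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def run(self, prompt_name: str, engine_name: str, **values: str) -> str:
        # Look up the prompt and the engine by name, then combine them.
        prompt = self.prompts[prompt_name].render(**values)
        return self.engines[engine_name](prompt)


registry = Registry()
registry.prompts["greet"] = PromptEntry("You are polite.", "Greet {name}.")
# Offline stand-in engine (assumption): returns the prompt unchanged.
registry.engines["passthrough"] = lambda prompt: prompt
```

Because prompts and engines are looked up by name, the same prompt can be tried against a different engine by changing only the `engine_name` argument.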
The paper discusses the potential benefits of the Sapper platform in reducing barriers to the development and use of deep learning models.
It provides details of a user study conducted to evaluate Sapper's ease of use and learning curve compared with programming directly in Python. The study's main comparison is between two tools, Sapper V2 and Python.
There was no significant difference in accuracy between the two tools, but participants spent significantly less time completing tasks with Sapper V2 than with Python. Python was more error-prone and participants had difficulty finding the correct API calls, while Sapper V2 provided a simpler and more integrated interface.
Sapper V2 outperformed Python in all metrics except prevalence and visibility. The study also compared Sapper V1 with Sapper V2 and found no significant difference in accuracy or time spent, but noted some issues with the current visual design of Sapper V2.
The study suggests that Sapper V2 is the more useful tool, but users remain free to choose between Sapper V1 and Sapper V2 based on their preferences.

Images

Design view. The user first consults a co-pilot to elicit their requirements (1), which are then summarized in (2) by another co-pilot. Once the task description is specific enough, the user clicks A to create the skeleton. In (3) they can change the skeleton by adding or deleting controls, and in (4) they can click B to change the inputs, prompts and models. After completing the changes, the user can click C to generate the block code in the Block view (next image).

Usability scores for native Python, Sapper V1 and Sapper V2.

Detailed time spent on each task. Unit: seconds.

Conclusion

This article presents a systematic methodology for AI chain engineering, covering 3 key concepts and 4 activities to improve the modularity, composability, debuggability and reusability of AI chain functions. Building on this methodology, we introduce Prompt Sapper, a block-style visual programming tool that enables AI chain engineers, including non-technical users, to create prompt-based AI services on top of foundation models through conversational requirements analysis and visual programming. Sapper includes two views, Design view and Block view, with LLM co-pilots that help elicit requirements, generate the AI chain skeleton, and run and test the AI service. In a user study with 16 participants, we demonstrate Prompt Sapper's low barrier to entry and practicality, as well as its significant time savings compared to traditional IDEs. The tool eliminates the need to learn various API calls and minimizes time spent searching for resources.