
AI Tools for Software Engineers: Practical Uses Beyond the Hype

Unlock the real potential of AI tools for software engineers. Simon Willison shares practical tips on prompt engineering, GitHub Copilot, and leveraging LLMs effectively.

Key Takeaways

  • Large language models (LLMs) like GPT-4 are powerful tools for developers, acting like versatile calculators for text manipulation and code generation tasks.
  • GitHub Copilot significantly speeds up coding by suggesting completions, especially for boilerplate code, tests, and unfamiliar language features.
  • Effective prompt engineering is crucial; framing requests clearly, providing context, and iterating on prompts yields much better results from AI tools.
  • LLMs can hallucinate or produce incorrect information, requiring developers to critically evaluate and verify the generated code or explanations.
  • Tools like Simon Willison's llm CLI utility allow engineers to integrate LLMs directly into their terminal workflows for diverse tasks.
  • Beyond code generation, AI tools excel at explaining code, converting data formats, writing documentation, and summarizing complex information.
  • Developers should treat AI output as a starting point, always reviewing, testing, and refining the suggestions rather than blindly trusting them.
  • LLMs have been adopted into development workflows remarkably quickly, changing how many engineers approach daily coding challenges.

Understanding LLMs as Developer Tools

  • Large language models should be viewed as powerful, albeit imperfect, assistants for software engineers, capable of accelerating many common tasks but requiring human oversight and critical evaluation. They are particularly adept at manipulating text and code based on prompts.
  • Simon Willison compares LLMs to calculators, but for words: they automate tasks that were previously manual and tedious, such as generating boilerplate code, writing tests, or explaining complex regular expressions, freeing up developer time for higher-level problem-solving.
  • A key capability is code generation. By providing a clear prompt, examples, or existing code context, developers can get suggestions for functions, classes, or entire scripts, significantly reducing typing and initial development time, though verification is essential.
  • These models can also act as powerful debuggers or code explainers. Pasting error messages or code snippets and asking for explanations or potential fixes often yields helpful insights faster than traditional searching, especially for obscure errors or unfamiliar libraries.
  • Data transformation is another strong suit. LLMs can convert data between formats (e.g., JSON to CSV, YAML to Python dictionaries), generate sample data based on schemas, or even write simple SQL queries based on natural language descriptions of the desired data.
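
To make the data-transformation and SQL points concrete, here is a minimal sketch that uses the OpenAI Python client to turn a natural-language request into a SQL query. The model name, table schema, and prompt wording are illustrative assumptions, not anything prescribed by the article.

```python
# Minimal sketch: asking an LLM to draft a SQL query from a natural-language
# description. Model name and schema are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

schema = "orders(id INTEGER, customer_id INTEGER, total REAL, created_at TEXT)"
request = "Total revenue per customer in 2024, highest first."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; use whatever you have access to
    messages=[
        {
            "role": "system",
            "content": f"You write SQLite queries. Schema: {schema}. "
                       "Reply with the SQL only.",
        },
        {"role": "user", "content": request},
    ],
)

print(response.choices[0].message.content)
# Read and test the generated SQL before running it against real data.
```

The closing comment echoes the article's broader point: the output is a starting point to verify, not a finished answer.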

Mastering Prompt Engineering for Better Results

  • Effective use of AI tools hinges on skillful prompt engineering. Instead of simple requests, developers should provide detailed context, specify the desired output format, and guide the model towards the correct solution, treating it like instructing a junior developer.
  • Providing examples within the prompt ("few-shot learning") dramatically improves output quality. If you want code in a specific style or data in a certain format, showing the model examples helps it understand the requirements more accurately; see the sketch after this list.
  • Iterative refinement is key. The first response might not be perfect, so developers should analyze the output, identify flaws, and refine the prompt with more specific instructions or corrections, progressively guiding the model. "Explain this like I'm five" or "Rewrite this more concisely" are useful iterative commands.
  • Understanding the model's limitations, such as hallucinations or plausible-but-incorrect code, is crucial. Prompts can encourage caution: asking the model to "cite sources" (where applicable to the underlying data) or to "explain the reasoning" behind a code suggestion helps surface potential issues.
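
The few-shot idea is easiest to see in a concrete prompt. The sketch below hard-codes two worked examples before the real request; the records and output format are invented purely for illustration, and the resulting string can be sent to any model (for instance via the client in the earlier sketch).

```python
# Minimal sketch of a few-shot prompt: two worked examples establish the
# desired output format before the real request. Content is illustrative.
few_shot_prompt = """Convert each record to JSON with keys "name" and "email".

Input: Ada Lovelace <ada@example.com>
Output: {"name": "Ada Lovelace", "email": "ada@example.com"}

Input: Grace Hopper <grace@example.com>
Output: {"name": "Grace Hopper", "email": "grace@example.com"}

Input: Alan Turing <alan@example.com>
Output:"""

# Send few_shot_prompt to whichever model you use; the worked examples
# strongly constrain the format of the reply.
```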

GitHub Copilot in Practice

  • GitHub Copilot acts as an "autocomplete on steroids," integrating directly into the code editor and offering real-time suggestions as the developer types, drastically reducing the effort needed for repetitive code patterns and standard library usage.
  • It excels at writing unit tests, often generating relevant test cases based on the function signature and surrounding code context with minimal prompting, which encourages better testing practices by lowering the barrier to entry (illustrated in the sketch after this list).
  • Copilot is particularly useful when working with unfamiliar languages or frameworks. It can provide idiomatic code snippets and usage examples, accelerating the learning curve and helping developers write code that adheres to community conventions.
  • While powerful, Copilot's suggestions require careful review. It can sometimes produce code that is subtly incorrect, inefficient, or introduces security vulnerabilities. Developers must remain vigilant and treat suggestions as starting points, not finished products.
    • It might confidently generate code that uses non-existent library functions.
    • It can sometimes miss edge cases or error handling requirements.
    • Security best practices are not always followed in generated code.
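
To illustrate the unit-test point, the snippet below shows a small function together with the kind of tests Copilot typically proposes once you start typing a test name. Both the function and the tests are invented examples, not captured Copilot output.

```python
# Illustrative example only: a hand-written function plus the kind of tests
# Copilot tends to suggest once you start typing a test name.
import re

def slugify(title: str) -> str:
    """Lower-case a title and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# Typing "def test_slugify" in a test file is usually enough context for
# Copilot to propose cases like these; review them, since suggested tests can
# miss edge cases (empty strings, Unicode titles) just as the article warns.
def test_slugify_basic():
    assert slugify("Hello, World!") == "hello-world"

def test_slugify_strips_extra_separators():
    assert slugify("  AI   Tools  ") == "ai-tools"
```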

Leveraging Command-Line and Custom AI Tools

  • Simon Willison developed the llm command-line tool to seamlessly integrate large language models into terminal workflows, allowing developers to pipe text, run prompts, and manage different models without leaving their preferred environment.
    • This facilitates quick tasks like summarizing text files, generating commit messages, or explaining shell commands directly from the CLI (see the commit-message sketch after this list).
    • It supports various models (OpenAI, Anthropic, local models) and allows storing prompts for reuse.
  • Beyond general-purpose tools, developers can use LLM APIs to build custom tools tailored to specific project needs, such as custom documentation generators, specialized code refactoring scripts, or internal chatbots trained on project-specific knowledge.
  • Embedding generation is a powerful technique accessible via tools like llm. This converts text into numerical vectors, enabling semantic search, recommendation systems, and classification tasks, which developers can integrate into their applications.
  • The ability to run smaller, specialized LLMs locally (e.g., running Llama-family models with Ollama) offers advantages in cost, privacy, and offline access, making AI capabilities available even for tasks involving sensitive data that shouldn't be sent to third-party APIs.
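
As a sketch of the commit-message workflow mentioned above, the snippet below pipes the staged git diff into the llm CLI from Python. It assumes llm is installed and configured with an API key; the -s/--system option for supplying a system prompt follows the tool's documented usage, but the prompt text itself is an invented example.

```python
# Minimal sketch: drafting a commit message by piping `git diff --staged`
# into Simon Willison's llm CLI. Assumes `llm` is installed and a key is
# configured; the prompt wording is an invented example.
import subprocess

diff = subprocess.run(
    ["git", "diff", "--staged"], capture_output=True, text=True, check=True
).stdout

if not diff.strip():
    raise SystemExit("Nothing staged; stage some changes first.")

draft = subprocess.run(
    ["llm", "-s", "Write a one-line conventional commit message for this diff."],
    input=diff, capture_output=True, text=True, check=True,
).stdout.strip()

print(draft)  # review and edit before committing; treat it as a starting point
```

If a local model is configured (for example through an Ollama-backed plugin), selecting it with -m keeps the diff on your own machine, in line with the privacy point in the final bullet.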

AI tools offer substantial productivity gains for software engineers, but require skillful application and critical oversight. Understanding prompt engineering and the limitations of these models is key to harnessing their benefits effectively.
