---
title: Automat(t)ic excerpts
date: 2025-11-12T11:03:19Z
modified: 2025-11-18T09:41:46Z
permalink: "https://dgw.ltd/2025/11/12/automattic-excerpts/"
type: post
status: publish
excerpt: ""
wpid: 602
categories:
  - AI
featured_image: "https://dgw.ltd/wp-content/uploads/2025/11/Screenshot-2025-11-12-at-10.59.38.png"
featured_image_alt: Running a WP CLI command to generate summaries for WordPress excerpts
---

I saw this post recently, [Automating Content Previews with Local AI](https://cognition.happycog.com/article/when-2500-pages-need-summaries-automating-content-previews-with-local-ai) – a really interesting approach to automating something that would otherwise have had a large impact on people’s time. This is something I’ve had to deal with on a number of projects involving a large content migration programme: even if a task only takes a few minutes to complete, with 1000s of content items it adds up fast – 5 minutes × 1,000 items is over 83 hours, roughly 3.5 days of continuous work. But people don’t work 24 hours a day, so even if you could spend your entire working day doing copy/paste/edit, best case scenario this would still take a few weeks to complete.

Obviously there are methods to aid content migration, and I’ve built a number of tools via WP CLI to migrate and update content, but often there is a certain amount of manual editing required (hands up who has command-clicked ‘edit post’ 100 times and hand-edited content?).

But this approach as outlined in the post – using a local LLM to generate content summaries – was a fascinating idea. Setting aside the environmental costs and energy consumption of running the large data centres behind LLMs like ChatGPT, Claude, and Gemini, there is also a practical discussion here – why use cloud-based models for everything? Local LLMs are becoming more and more powerful (obviously there is still an energy and environmental cost to training these models as well) – for example, [Alibaba’s Qwen](https://en.wikipedia.org/wiki/Qwen) 2.5 model sits [somewhere between GPT-4o and 3.5](https://simonwillison.net/2024/Nov/12/qwen25-coder/) and was released under the Apache 2.0 license.

As Happy Cog’s post highlighted, one of the reasons for using a local LLM was that they wanted to iterate on the tool to improve the prompt and method used – rather than paying cloud-based AI services per token for each request.

> This project demonstrated that local AI can be a powerful, cost-effective solution for large-scale content processing tasks.

Naturally I wanted to look at building a version for WordPress and [WP CLI](https://wp-cli.org), using [Guzzle HTTP](https://github.com/guzzle/guzzle) to connect to the model. They used [Docker Model Runner](https://www.docker.com/blog/introducing-docker-model-runner/) to build and run the local LLM. In the end I decided to go with [Ollama](https://ollama.com): I already had it installed and it seems a bit simpler and more lightweight. I considered using the new [PHP AI Client](https://github.com/WordPress/php-ai-client), but this didn’t support local LLMs, which kind of defeats the point of what I wanted to replicate from Happy Cog’s approach.

They also very helpfully posted some of their code in the blog post, so I was able to lean very heavily on that:


```php
/**
 * Generate AI summary for given content using local model
 *
 * @param string $content Cleaned content text (no HTML)
 * @return string|null Generated summary or null on failure
 */
public function generateAiSummary(string $content): ?string
{
    // Environment variables:
    // LLM_BASE_URL: http://localhost:12434 (or http://model-runner.docker.internal for Docker)
    // LLM_MODEL_NAME: ai/qwen2.5

    $client = new GuzzleClient([
        'base_uri' => App::env('LLM_BASE_URL'),
        'timeout' => 30, // 30-second timeout for processing
    ]);

    try {
        $response = $client->request('POST', '/engines/v1/chat/completions', [
            'headers' => [
                'Content-Type' => 'application/json',
            ],
            'json' => [
                'model' => App::env('LLM_MODEL_NAME'),
                'messages' => [
                    // Sample system message
                    [
                        'role' => 'system',
                        'content' => 'You are content copywriter. You must write exactly one paragraph starting with "Discover" or "Learn about". Maximum 120 words. Use simple, clear language. No bullet points or special formatting.'
                    ],
                    // Sample prompt
                    [
                        'role' => 'user',
                        'content' => "Write ONE concise paragraph under 120 words summarizing this content:\n\n```\n{$content}\n```"
                    ]
                ],
                'max_tokens' => 150,
                'temperature' => 0.7, // Slight creativity while maintaining consistency
            ]
        ]);

        if ($response->getStatusCode() === 200) {
            $json = json_decode($response->getBody()->getContents(), true);

            if (isset($json['choices'][0]['message']['content'])) {
                return trim($json['choices'][0]['message']['content']);
            }
        }
    } catch (Exception $e) {
        // Log error for debugging
        Craft::error("AI summary generation failed: " . $e->getMessage());
    }

    return null;
}
```

Ollama uses a different endpoint, namely `/v1/chat/completions`, but the rest of the code translated well to WordPress, and the prompts they had refined over many iterations were obviously very useful.
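To illustrate, here is a minimal sketch of the Guzzle request adapted for Ollama’s OpenAI-compatible endpoint, assuming Ollama is running locally on its default port (11434) and a `qwen2.5` model has already been pulled – treat the model name and prompts as placeholders:

```php
<?php
// Sketch: Happy Cog's request pointed at Ollama's OpenAI-compatible API.
// Assumes `ollama pull qwen2.5` has been run and Ollama is listening
// on its default port, 11434.
require 'vendor/autoload.php';

use GuzzleHttp\Client;

$client = new Client([
    'base_uri' => 'http://localhost:11434',
    'timeout'  => 30,
]);

$response = $client->request('POST', '/v1/chat/completions', [
    'json' => [
        'model'    => 'qwen2.5',
        'messages' => [
            ['role' => 'system', 'content' => 'You are a content copywriter. Write exactly one paragraph. Maximum 120 words.'],
            ['role' => 'user',   'content' => "Write ONE concise paragraph under 120 words summarizing this content:\n\n{$content}"],
        ],
        'max_tokens'  => 150,
        'temperature' => 0.7,
    ],
]);

$json    = json_decode($response->getBody()->getContents(), true);
$summary = trim($json['choices'][0]['message']['content'] ?? '');
```

The response shape is the same as the OpenAI API, so the `choices[0].message.content` parsing from the original code carries over unchanged.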

The approach I took was the following:

1. Query posts via `WP_REST_Request` to a custom API endpoint (one I already had), with a fallback to `WP_Query`
2. Clean up the posts via `strip_shortcodes` and `wp_strip_all_tags`, then truncate them to 6,000 characters to avoid heavy processing
3. Send each post to the local LLM and generate the summary
4. Save the summary via `wp_update_post`
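The steps above can be sketched roughly as follows – `fetch_posts()` and `generate_ai_summary()` are hypothetical stand-ins for the REST endpoint query and the Ollama request:

```php
<?php
// Sketch of the excerpt loop. fetch_posts() and generate_ai_summary()
// are hypothetical helpers standing in for the REST query and the
// local LLM request.
foreach (fetch_posts() as $post) {
    // 2. Strip shortcodes and tags, then truncate to keep processing light
    $content = wp_strip_all_tags(strip_shortcodes($post->post_content));
    $content = mb_substr($content, 0, 6000);

    // 3. Generate the summary via the local LLM
    $summary = generate_ai_summary($content);

    // 4. Save it back as the post excerpt
    if ($summary !== null) {
        wp_update_post([
            'ID'           => $post->ID,
            'post_excerpt' => $summary,
        ]);
    }
}
```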

This is all wrapped in a WP CLI command:


```shell
wp ai-summary generate
```
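Registering a command like this is straightforward with `WP_CLI::add_command`; a minimal sketch, with illustrative names:

```php
<?php
// Minimal WP CLI command registration; the command name and callback
// body are illustrative.
if (defined('WP_CLI') && WP_CLI) {
    WP_CLI::add_command('ai-summary', function ($args, $assoc_args) {
        // ...query, clean, summarise, and save each post here...
        WP_CLI::success('Summaries generated.');
    });
}
```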

And it worked pretty well:

![Running a WP CLI command to generate summaries for WordPress excerpts](https://dgw.ltd/wp-content/uploads/2025/11/CleanShot-2025-11-12-at-10.55.10@2x-1024x552.png)

As with other posts on this blog, I don’t intend to use any of this in production, but it offers a toolset we can reach for when faced with similar challenges in future projects.

## Update

I was considering this approach whilst writing this post, and naturally came to the idea: what about alt tags? I do write alt tags for my media, but sometimes I miss the odd one. Not cool, but it happens from time to time.

I looked about, and Qwen 2.5 does indeed have a vision-language model, qwen2.5-vl. So I hooked up an image processor in a similar way:

1. `WP_Query` with `'post_type' => 'attachment'`
2. Find any missing alt tags via `_wp_attachment_image_alt`
3. Generate alt text via the AI image model and a prompt
4. Update the alt text
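That flow can be sketched like this – the attachment query and meta key are standard WordPress, while `generate_alt_text()` is a hypothetical stand-in for the qwen2.5-vl request:

```php
<?php
// Sketch of the alt-text loop. generate_alt_text() is a hypothetical
// helper wrapping the vision-model request.
$attachments = new WP_Query([
    'post_type'      => 'attachment',
    'post_status'    => 'inherit',
    'post_mime_type' => 'image',
    'posts_per_page' => -1,
    // 2. Only attachments with missing or empty alt text
    'meta_query'     => [
        'relation' => 'OR',
        ['key' => '_wp_attachment_image_alt', 'compare' => 'NOT EXISTS'],
        ['key' => '_wp_attachment_image_alt', 'value' => '', 'compare' => '='],
    ],
]);

foreach ($attachments->posts as $attachment) {
    // 3. Generate alt text from the image file via the vision model
    $alt = generate_alt_text(get_attached_file($attachment->ID));

    // 4. Save it back to the attachment meta
    if ($alt !== null) {
        update_post_meta($attachment->ID, '_wp_attachment_image_alt', $alt);
    }
}
```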

And it also worked, pretty damn well. Some examples:

![DeLorean time machine near a lake](https://dgw.ltd/wp-content/uploads/2022/12/IMG_6595-1024x768.webp)

A DeLorean car with its doors and trunk open, parked on a gravel beach near a calm body of water with boats

![Rocket emoji](https://dgw.ltd/wp-content/uploads/2025/04/rocket.png)

A cartoon rocket ship with flames shooting out, set against a vibrant pink background.