Adam Caudill

Security Leader, Researcher, Developer, Writer, & Photographer

Generating Content Stats for Hugo

Producing useful insight into your content

I recently became curious about just how much time I had spent working on content for this site, which led me to an idea: it would be great to have a page listing some useful data about the content and how much effort went into it. I had hoped to pull some of this directly out of Hugo, but unfortunately it didn’t expose the information I wanted (and certainly not in an efficient way).

Building JSON Stats Data

I quickly decided that the simplest solution was a small Python script to collect the data I needed. This would be easy to integrate into the Cloudflare Pages build process, and would make it simple to add new information over time.

In this case, I wanted a few data elements:

  • Total Number of Posts
  • Total Number of Words
  • Total Number of Characters
  • Total Estimated Reading Time
  • Total Estimated Writing Time

These are simple enough to gather, so I put a script together.

This script simply looks for all of the .md files in content/posts/, uses the frontmatter library to load the content and return the body without the frontmatter (so it doesn’t skew the numbers), and then collects the information from each file.

There are a couple of somewhat arbitrary choices in the script that can impact the results:

  • Reading Speed: The general consensus from a few minutes of searching is that the standard value used to estimate reading speed is 200 words per minute.
  • Writing Speed: According to one source the average writing time varies quite dramatically; from 5 words per minute for “in-depth essays or articles” to 40-70 words per minute for other writing. When factoring editing, revisions, collecting feedback on drafts, and other add-on time required to go from a raw collection of words to a ready-to-post article, the 5 words per minute number seems most accurate to me & for my workflow. That said, it could be completely different for you.
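As a rough illustration of the approach described above, here is a minimal sketch; it is not the original script. To stay dependency-free it swaps the frontmatter library for a simple split on the leading --- delimiters (with the library, frontmatter.load(path).content would return the body instead). The function names, the recursive file search, and the whitespace-based word count are assumptions; the 200 and 5 words-per-minute rates come from the choices listed above.

```python
import json
from pathlib import Path

READING_WPM = 200  # words per minute, the reading estimate discussed above
WRITING_WPM = 5    # words per minute, the writing estimate discussed above


def strip_front_matter(text):
    """Return the post body without a leading '---' front matter block."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                return "\n".join(lines[i + 1:])
    return text  # no front matter found, keep everything


def collect_stats(posts_dir):
    """Total up posts, characters, and words for every .md file found."""
    post_count = total_chars = total_words = 0
    # Assumption: posts may live in subdirectories, so search recursively
    for path in sorted(Path(posts_dir).rglob("*.md")):
        body = strip_front_matter(path.read_text(encoding="utf-8"))
        post_count += 1
        total_chars += len(body)
        total_words += len(body.split())  # whitespace-separated words
    return {
        "blog_post_count": post_count,
        "blog_total_chars": total_chars,
        "blog_total_words": total_words,
        "blog_average_words": total_words / post_count if post_count else 0,
        "blog_reading_time": total_words / READING_WPM,  # minutes
        "blog_writing_time": total_words / WRITING_WPM,  # minutes
    }


def write_stats(stats, out_path):
    """Write the stats dictionary as JSON, creating the directory if needed."""
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(stats, indent=3), encoding="utf-8")


if __name__ == "__main__":
    write_stats(collect_stats("content/posts"), "content/stats/stats.json")
```

Run from the site root as part of the build, a script along these lines would refresh the JSON file before Hugo renders the page.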

When this script is run, it will produce a stats.json file in the content/stats/ directory. It should look something like this:

{
   "blog_post_count":342,
   "blog_total_chars":1284579,
   "blog_total_words":197629,
   "blog_average_words":577.8625730994152,
   "blog_reading_time":988.145,
   "blog_writing_time":39525.8
}

The file can then be read in and used in content/stats/index.md, displayed however you like.
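As a quick sanity check, the derived fields in the sample output follow directly from the totals: the average is words divided by posts, and the two time estimates are total words divided by the reading and writing rates, which means they are in minutes. The snippet below is only an illustration; the variable names and the conversion to hours are assumptions, not part of the original script.

```python
# Totals copied from the sample stats.json output
post_count = 342
total_words = 197629

READING_WPM = 200  # reading rate discussed earlier
WRITING_WPM = 5    # writing rate discussed earlier

average_words = total_words / post_count      # matches blog_average_words
reading_minutes = total_words / READING_WPM   # matches blog_reading_time
writing_minutes = total_words / WRITING_WPM   # matches blog_writing_time

# Converting minutes to hours (an added convenience for display)
reading_hours = reading_minutes / 60
writing_hours = writing_minutes / 60

print(f"{average_words:.1f} words per post")
print(f"{reading_hours:.1f} hours to read, {writing_hours:.1f} hours to write")
```

For the sample numbers this works out to roughly 16.5 hours of reading and about 659 hours of writing.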

Displaying the JSON Data

Hugo does have options for displaying data, such as Data Templates or, more manually, the getJSON function, but these seemed inelegant for my needs. Thankfully I found another option: a shortcode that reads in a JSON file and makes the values easy to use with a simple syntax: {{< jsondata src="data.json" var="max_date" >}}. This makes it extremely easy to embed JSON data wherever it’s needed.

I did need to make a change to the shortcode to suit my needs, however: the way some larger numbers were displayed wasn’t quite what I wanted, and the format parameter wasn’t able to resolve it. So I added a check for numeric data types when a format isn’t specified, and used Hugo’s lang.NumFmt to format the number in a more human-readable way.

{{- $json_filename := .Get "src" | default "data.json" -}}
{{- $json_data_filepath := path.Join "content" (path.Dir .Page.File) $json_filename -}}
{{- if fileExists $json_data_filepath -}}
  {{- $json_data := getJSON $json_data_filepath -}}
  {{- $json_varname := .Get "var" -}}
  {{- $var_value := index $json_data $json_varname -}}
  {{- if $var_value -}}
    {{- $json_format := .Get "format" -}}
    {{- if $json_format -}}
      {{ printf $json_format $var_value }}
    {{- else -}}
      {{- $type := (printf "%T" $var_value) -}}
      {{- if or (eq "int" $type) (eq "int64" $type) (eq "float64" $type) -}}
        {{ $var_value | lang.NumFmt 0 }}
      {{- else -}}
        {{ $var_value }}
      {{- end -}}
    {{- end -}}
  {{- else -}}
    {{ errorf "Cannot get the value of the variable %s from the data file: %s" $json_varname $json_data_filepath }}
  {{- end -}}
{{- else -}}
  {{ errorf "Cannot find the file: %s" $json_data_filepath }}
{{- end -}}
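For readers more comfortable in Python than in Go templates, the formatting fallback in the shortcode can be approximated as below. This is a hypothetical analogue rather than anything Hugo runs: format_value is an invented name, and Python's thousands-separator formatting only stands in for lang.NumFmt 0, so rounding and locale details may differ.

```python
def format_value(value, fmt=None):
    """Mimic the shortcode: explicit format first, then numeric grouping, else as-is."""
    if fmt:
        # Same idea as `printf $json_format $var_value` in the template
        return fmt % value
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        # Zero decimal places with digit grouping, standing in for lang.NumFmt 0
        return f"{value:,.0f}"
    # Strings and other types are shown unchanged
    return str(value)


print(format_value(197629))           # grouped integer
print(format_value(39525.8))          # rounded to zero decimals, grouped
print(format_value(988.145, "%.1f"))  # an explicit format takes priority
```

The point of the numeric branch is simply that a raw value such as 197629 reads better with grouping when no explicit format is supplied.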

Hopefully this is as useful for you as it is for me, and saves someone else some time.


