I recently became curious about just how much time I had spent working on content for this site, which led me to an idea: it would be great to have a page that listed some useful data about the content, and how much effort was put into it. I had hoped to pull some of this directly out of Hugo, but unfortunately it didn’t expose the information I wanted (and certainly not in an efficient way).
I quickly decided that the simplest solution was a small Python script to collect the data I needed. This would be easy to integrate into the Cloudflare Pages build process, and would make it easy to add new information over time.
In this case, I wanted a few data elements:
- Total Number of Posts
- Total Number of Words
- Total Number of Characters
- Total Estimated Reading Time
- Total Estimated Writing Time
These are simple enough to gather, so I put together a script to collect them.
This script simply looks for all of the .md files in content/posts/, uses the frontmatter library to load the content and return the body without the frontmatter (so it doesn’t skew the numbers), and then collects the information from each file.
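A minimal sketch of that approach might look like the following. It is an illustration under a few assumptions rather than the exact script: it assumes the python-frontmatter package (imported as frontmatter), a recursive search for .md files, and words counted by splitting on whitespace. The function names load_bodies and count_content are hypothetical.

```python
from pathlib import Path
from typing import Iterable


def load_bodies(posts_dir: str = "content/posts") -> list[str]:
    """Return the body of every .md file under posts_dir, without front matter."""
    # Assumption: the python-frontmatter package is installed. It is imported
    # here so that count_content() below can be used without it.
    import frontmatter

    # Assumption: posts may live in subdirectories, so search recursively.
    paths = sorted(Path(posts_dir).rglob("*.md"))
    return [frontmatter.load(str(path)).content for path in paths]


def count_content(bodies: Iterable[str]) -> dict:
    """Count posts, words, and characters across the given post bodies."""
    post_count = total_words = total_chars = 0
    for body in bodies:
        post_count += 1
        total_words += len(body.split())  # assumption: whitespace-delimited words
        total_chars += len(body)
    return {
        "blog_post_count": post_count,
        "blog_total_words": total_words,
        "blog_total_chars": total_chars,
    }
```

Calling count_content(load_bodies()) from the site root would then give the three raw totals; the key names follow the sample output in this post.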
There are a couple of somewhat arbitrary choices in here that can impact the results:
- Reading Speed: The general consensus from a few minutes of searching is that the standard value used to estimate reading speed is 200 words per minute.
- Writing Speed: According to one source, the average writing speed varies quite dramatically, from 5 words per minute for “in-depth essays or articles” to 40-70 words per minute for other writing. When factoring in editing, revisions, collecting feedback on drafts, and the other add-on time required to go from a raw collection of words to a ready-to-post article, the 5 words per minute number seems most accurate to me and for my workflow. That said, it could be completely different for you.
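Both estimates come down to dividing the total word count by a words-per-minute rate, so the results are in minutes. A minimal sketch, with hypothetical constant and function names, using the word total from the sample output in this post:

```python
READING_WPM = 200  # words per minute, the common reading-speed estimate
WRITING_WPM = 5    # words per minute, including editing and revisions


def estimate_minutes(total_words: int, words_per_minute: float) -> float:
    """Estimate how many minutes total_words takes at the given pace."""
    return total_words / words_per_minute


if __name__ == "__main__":
    total_words = 197629  # word total taken from the sample stats.json
    print(estimate_minutes(total_words, READING_WPM))  # 988.145
    print(estimate_minutes(total_words, WRITING_WPM))  # 39525.8
```

Swapping in a different rate is just a matter of changing the constants, which is worth doing if 5 words per minute doesn't match your own workflow.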
When this script is run, it will produce a stats.json file in the content/stats/ directory. It should look something like this:
{
  "blog_post_count": 342,
  "blog_total_chars": 1284579,
  "blog_total_words": 197629,
  "blog_average_words": 577.8625730994152,
  "blog_reading_time": 988.145,
  "blog_writing_time": 39525.8
}
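Producing a file like this takes only the standard json module. A minimal sketch, assuming the collected values are already in a dictionary; write_stats is a hypothetical helper name, and the default output directory is the content/stats/ directory mentioned above:

```python
import json
from pathlib import Path


def write_stats(stats: dict, out_dir: str = "content/stats") -> Path:
    """Write stats as stats.json inside out_dir and return the file path."""
    out_path = Path(out_dir) / "stats.json"
    # Create the directory if this is the first run.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(stats, indent=2) + "\n", encoding="utf-8")
    return out_path
```

Because json.dumps never emits a trailing comma, the resulting file is always valid JSON for Hugo to read.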
This can then be read in and used in content/stats/index.md, displayed however you like.
Hugo does have options for displaying data, such as Data Templates or, more manually, the getJSON function, but these seemed inelegant for my needs. Thankfully I found another option: a shortcode that reads in a JSON file and makes the values easy to use with a simple syntax: {{< jsondata src="data.json" var="max_date" >}}. This makes it extremely easy to embed JSON data wherever it’s needed.
I did need to make a change to the shortcode to suit my needs, however. The way some larger numbers were displayed wasn’t quite what I wanted, and the format parameter wasn’t able to resolve it. So I added a check for numeric data types when a format isn’t specified, and used Hugo’s lang.NumFmt to format the number in a more human-readable way.
{{- /* Resolve the JSON file relative to the page that calls the shortcode. */ -}}
{{- $json_filename := .Get "src" | default "data.json" -}}
{{- $json_data_filepath := path.Join "content" (path.Dir .Page.File) $json_filename -}}
{{- if fileExists $json_data_filepath -}}
  {{- $json_data := getJSON $json_data_filepath -}}
  {{- $json_varname := .Get "var" -}}
  {{- $var_value := index $json_data $json_varname -}}
  {{- if $var_value -}}
    {{- $json_format := .Get "format" -}}
    {{- if $json_format -}}
      {{ printf $json_format $var_value }}
    {{- else -}}
      {{- /* No format given: print numeric values via lang.NumFmt with 0 decimal places. */ -}}
      {{- $type := (printf "%T" $var_value) -}}
      {{- if or (eq "int" $type) (eq "int64" $type) (eq "float64" $type) -}}
        {{ $var_value | lang.NumFmt 0 }}
      {{- else -}}
        {{ $var_value }}
      {{- end -}}
    {{- end -}}
  {{- else -}}
    {{ errorf "Cannot get the value of the variable %s from the data file: %s" $json_varname $json_data_filepath }}
  {{- end -}}
{{- else -}}
  {{ errorf "Cannot find the file: %s" $json_data_filepath }}
{{- end -}}
Hopefully this is as useful for you as it is for me, and saves someone else some time.