BBBBlog

Jason Burns – IT Superhero

Obsidian Properties showing nested YAML frontmatter

Nested YAML Frontmatter for Obsidian Book Notes

After a few happy years of storing all my books in a private Obsidian vault instead of Goodreads, I see a small thing that needs changing in the metadata. I’ve dreaded fixing this because it’s a lot to think through all the cases. But, here goes!

Background

Based on my experience with Goodreads to Obsidian, I created some YAML frontmatter to deal with series name and number. The frontmatter helps you treat each book note like an entry in a database, and you can query all your books based on this data! It’s what started me on this path in the first place!

series:
  series_name: Series Name Here
  series_num: 1

I BELIEVE that I generated valid YAML frontmatter, but I don’t know a lot about these things, so I was just winging it based on what made sense. Looking at the Github examples above, they use nested values right away.

It seemed to me like “Series” was an object comprising at least two items for that book, name of the series the book belong to, and number of the book within the series. There may have been more that I couldn’t think of, so I went with this nested structure.

Then I imported 400 book notes into my Obsidian with nested frontmatter data. This import and frontmatter were all handled programmatically through a template system setup in Converting from Goodreads to Obsidian. Any new books are also added through a separate, but similar, template in the Book Search plugin.

Problem Statement

Along comes the Obsidian devs and they helpfully turned frontmatter into Properties. For most folks and use cases this is probably really nice. Even for me, it’s nice most of the time. You can set your property types and update them through GUI drop downs and checkboxes.

HOWEVER, Obsidian Properties don’t support nested YAML data!

Judging by the age and number of comments on this thread, I’m not the only one with the problem.

Here’s what a nested frontmatter YAML looks like in Obsidian source view. Reasonable:

series:
  series_name: The Wheel of Time
  series_num: 6

HOWEVER, here’s what it looks like when you turn off source mode in the normal view:

Obsidian Properties showing nested YAML frontmatter

The property series gets turned into an unknown Property icon by Obsidian, and the property value gets turned into this pretty ugly string made for computers:

{"series_name":"The Wheel of Time","series_num":6}

To edit any of these series name or number values by hand, I need to switch to source mode.

Another interesting note is that this is JSON encoding, where both the key string and the value (if a string) require quotes. However, I entered the frontmatter data in YAML (made for humans), where apparently neither string keys nor string values require quotes. This will likely trip me up many times in my life. Maybe I’ll remember it from now on?

Potential Solutions

So my solutions for my own Obsidian vault are perhaps one of these three:

  1. Wait for Obsidian to support nested YAML frontmatter natively. (Or git gud at coding and write it myself.)
  2. Adopt a user maintained plugin that supports nested frontmatter.
  3. Since books are the ONLY note type that uses nested frontmatter in my vault, flatten the data structure for all 400+ of my book notes.

Let’s address them in turn:

1 – Waiting

I’m already doing this, and got fed up enough that I’m spending a morning writing this blog and looking for solutions. So – this one is out.

2 – Using a Plugin

There is a plugin that does exactly what I want, but also more than I need. The Electronic Lab Notebook supports nested frontmatter (Properties) editing. It also adds a note creation dialog, image viewer, and an interactive periodic table of elements. It looks pretty cool, but I have no need for the extra items.

I watched a ridiculous video the other day that listed a workstation setup “telos” including:

Try not to stray from defaults.

– ThePrimagen

Regardless of what you think of the presentation style, this idea resonates with me. I spent years customizing Gentoo Linux systems, keeping them all running for various tasks, with specialized configuration files on each one. Complicated workflows ruled my life. The thought of rebuilding that complexity in a new system is why I’ve switched to Debian on the server and Pop!_OS on my desktop for most things.

I just need to get work done.

The more plugins I add to my Obsidian, the more configuration I layer on top, the harder my maintenance tasks become. This plugin was a bridge too far, especially considering that the job of my Obsidian is to track my journal, track my notes, and WRITE. It’s pretty critical to my life and happiness workflow these days, so I need it to be as simple as possible.

A plugin for Tasks makes my life better. A plugin for Books makes my life better. A plugin for managing nested Properties? No. Just another thing to maintain that can break.

3 – Flatten The Data

Honestly, the solution is right in front of my face. Just flatten the frontmatter. There isn’t any real need to nest here. I already have unique prefixes on the values anyway.

Nested:

series:
  series_name:
  series_num:

Flattened:

series_name:
series_num:

This is short and simple, so why is this blog post so long? Implementation details.

Implementing the Solution

There are at least three problems I need to solve to implement this very straightforward solution.

  1. Update 400+ existing notes without breaking anything.
  2. Update The Book Search template for adding new notes.
  3. Update the Goodreads to Obsidian converter project and consider my devoted user base.

1- Converting Over 400 Notes

There are python modules that make it easy to manage YAML frontmatter. I’m not very worried at how difficult this will be, but I have to sit down and write the python, test that it works on some sample vault, and then publish it. The worry comes in making it robust enough for other people to use without deleting their book notes or harming their vault.

The are a few approaches here are to either:

  • Write some python that reads in, edits, and modifies the frontmatter in a YAML aware way.
  • Write a quick script that just deletes one line, and two leading spaces on two lines.
  • Use search and replace in a text editor like Visual Studio Code.

The Python approach has the appeal of working for a broad audience across platforms. I haven’t touched any python in years, so it might take me half a day to implement and test this method.

The shell script approach has the appeal of taking 15 minutes to implement and test. However, it likely would be Linux specific.

The text editor approach is a little more manual for folks, but probably the simplest for me to just write up.

After a few days thinking about this, I spent an afternoon trying to write my own tool to flatten the frontmatter in a single file, but wound up asking AI to finish it. DuckDuckGo makes a duck.ai service available, so I asked GPT-4o mini. Here’s the first pass.

My Prompt:

Can you write a python script that takes as input a directory full of markdown files. For each of those markdown files, modify the YAML frontmatter to change from the following structure.

series:
    series_name: St. Leibowitz
    series_num: 1

Assume that all of the files in the folder have existing frontmatter with a series value. Can you read that series, series_name, and series_num value in all of the existing files, and then flatten the structure so series_name and series_num are root level fields?

And the code it spit back after probably 4 seconds or less was pretty close to what I had spent 4 hours tinkering on, because I write slow.

NOTE: Do not copy the Python code below. It has errors.

import os
import frontmatter

def flatten_series_in_yaml_frontmatter(directory):
    # Iterate through all files in the specified directory
    for filename in os.listdir(directory):
        if filename.endswith('.md'):
            file_path = os.path.join(directory, filename)

            # Read the markdown file
            post = frontmatter.load(file_path)

            # Check if 'series' exists in the frontmatter
            if 'series' in post:
                series = post['series']

                # Flatten the structure
                post['series_name'] = series.get('series_name')
                post['series_num'] = series.get('series_num')

                # Optionally, remove the original 'series' key
                del post['series']

                # Save the modified frontmatter back to the file
                with open(file_path, 'w') as f:
                    frontmatter.dump(post, f)

if __name__ == "__main__":
    # Specify the directory containing markdown files
    directory_path = input("Enter the directory path containing markdown files: ")
    flatten_series_in_yaml_frontmatter(directory_path)

But that code is kind of a mess. Look at all that nesting! Just garbage practices, so I asked:

Can you rewrite the code so it doesn't nest as deeply, by separating the different tasks into different functions?

And it said the following in about 2 seconds:

Note: Do not copy the Python code below. It has errors.

import os
import frontmatter

def load_markdown_file(file_path):
    """Load a markdown file and return its frontmatter and content."""
    return frontmatter.load(file_path)

def flatten_series(post):
    """Flatten the series structure in the frontmatter."""
    if 'series' in post:
        series = post['series']
        post['series_name'] = series.get('series_name')
        post['series_num'] = series.get('series_num')
        del post['series']  # Remove the original series key

def save_markdown_file(file_path, post):
    """Save the modified frontmatter back to the markdown file."""
    with open(file_path, 'w') as f:
        frontmatter.dump(post, f)

def process_markdown_files(directory):
    """Process all markdown files in the specified directory."""
    for filename in os.listdir(directory):
        if filename.endswith('.md'):
            file_path = os.path.join(directory, filename)
            post = load_markdown_file(file_path)
            flatten_series(post)
            save_markdown_file(file_path, post)

if __name__ == "__main__":
    # Specify the directory containing markdown files
    directory_path = input("Enter the directory path containing markdown files: ")
    process_markdown_files(directory_path)

That’s better! It could have more precise logic to only read in the markdown files, but I think this will work. I suppose what’s left is to test it out and see if it even runs!!

OK – this did not run. 🙁

Right out of the gate, it threw errors for me. It took me about 30 minutes to figure out why this program was:

  1. Throwing errors for the first file that it read in.
  2. Writing out completely blank files instead of stopping, which erased important data in my thankfully temporary test directory.

The problems were that the program above does NOT

  1. Check that the file has frontmatter.
  2. Use the correct write syntax (wb instead of w).
  3. Use the correct sort_keys=False

Thanks to the 4 hours I had put in last week, I was able to find a working write example and fix it up as follows. Importantly, when I asked ChatGPT to print the post without reordering the keys, it suggested that I should import another module and manually write out the frontmatter from a dictionary. I guess it didn’t know you could just pass sort_keys=False. Maybe it needs to read the documentation?

import os
import frontmatter

def load_markdown_file(file_path):
    """Load a markdown file and return its frontmatter and content."""
    print(f'Working on file {file_path}')
    if (frontmatter.check(file_path)): # Added a check for frontmatter
        post = frontmatter.load(file_path)
    else:
        print(f'Skipping. No frontmatter in {file_path}.')
        post = False
    return post

def flatten_series(post):
    """Flatten the series structure in the frontmatter."""
    if 'series' in post:
        series = post['series']
        post['series_name'] = series.get('series_name')
        post['series_num'] = series.get('series_num')
        del post['series']  # Remove the original series key

def save_markdown_file(file_path, post):
    """Save the modified frontmatter back to the markdown file."""
    with open(file_path, 'wb') as f: # needs 'wb'
        frontmatter.dump(post, f, sort_keys=False) # needs a sort_keys

def process_markdown_files(directory):
    """Process all markdown files in the specified directory."""
    for filename in os.listdir(directory):
        if filename.endswith('.md'):
            file_path = os.path.join(directory, filename)
            post = load_markdown_file(file_path)
            if (post):
                print(f'Post before flattening is:')
                print(frontmatter.dumps(post, sort_keys=False))
                flatten_series(post)
                print(f'Post after flattening is:')
                print(frontmatter.dumps(post, sort_keys=False))
                save_markdown_file(file_path, post)

if __name__ == "__main__":
    # Specify the directory containing markdown files
    directory_path = input("Enter the directory path containing markdown files: ")
    process_markdown_files(directory_path)

OK, now it’s time to QA this a little more.

QA found the following issues, but I don’t think they are going to stop me from publishing:

  1. Any trailing newline in your original note is deleted.
  2. Any YAML list in brackets on a single line is converted to multiple lines with dashes.
  3. Any YAML string with quotes no longer has quotes
  4. Any YAML empty value in the original note will now have Null

I think I can just put these caveats in the script and call it good. Those things are all part of proper YAML formatting. My book notes going INTO the process were a bit messy, and coming out of the process they’ll all be better formatted.

2 – Book Search Template Updates for Obsidian

So that’s the existing book notes out of the way. What about the new book notes we add with the awesome book search plugin in the future? We don’t want them nested moving forward.

These templates are used by the Book Search plugin within Obsidian. Whenever you add a new book, it creates a note based on the data from Google APIs, merged into this template.

This is really simple to update. My current template is:

---
tags:
  - book
  - media
publish: false
title: "{{title}}"
aliases: 
series:
  series_name: 
  series_num: 
author: [{{author}}] 
status: 
isbn: {{isbn10}}
isbn13: {{isbn13}}
category: [{{category}}]
rating: 
read_count: 
binding: 
num_pages: {{totalPage}} 
pub_date: {{publishDate}}
cover: {{coverURL}}
date_add: 
date_start: 
date_end: 
created: 2025-07-12 10:12
---
# {{title}}

## Description

{{description}}

## Review

The updates would just be:

---
series_name: 
series_num: 
---

I could also investigate whether the Book Search project has metadata about series names and numbers. It does not, but it may be something Google APIs has that we could add to the upstream Book Search. Google APIs does not appear to include any information about the series name and volume number in their Google Books API definition. However, when you go to a Google Books page on a book, you can find “Followed by:” and “Preceded by:”. This indicates some knowledge of the series and order exists somewhere. But not in a way that I find approachable.

3 – The Goodreads to Obsidian Converter

This is the hardest part. I want to change the frontmatter handling for a published project. However, I know of AT LEAST ONE person who has commented that they’re using my project. There are SIX people that have starred the project on GitHub.

I created an issue to at least track that I’m thinking about this problem.

The individuals out there in the world using the converter on version 1.0 would never know I’ve made this flattened change unless they follow GitHub, so maybe I don’t need to do anything for them. They’d keep using nested frontmatter. However, if they came back to GitHub for instructions, they’d find a template and dataview query that no longer works for them. They’d be forced to maintain their old setup, or migrate all their notes to the new flat way.

The individuals who discover my project AFTER I make a potential flattening would be using the new flat conversion, templates, and dataview by default. No problems there.

So the questions are

  • Do I publish my own conversion process in the project?
    • I think the obvious answer here is yes.
    • More python for me to write and publish.
  • Do I update the existing project to the new flat style?
    • I think this is a good idea.
    • It just requires updating the template file in the project, and the dataview query examples
    • I’d have to leave some note or warning that they changed.
  • Do I maintain the old nested style and provide instructions and templates for both flat and nested?
    • No – I don’t think I want to. I’ll just provide some instructions for how folks can handle this change.
    • They can choose to stay nested forever – or they can choose to migrate. If they stay nested forever, they’ll have to figure out on their own how to maintain their own templates and dataview queries moving forward. That shouldn’t be so hard if they made it THIS far.

Conclusion

After a few hours backing notes up, testing, and debugging, these changes are now live in my own vault and updated in the Github project.

Thanks for following along if you’ve made it this far. This development log has been a very helpful way for me to think through all the parts of solving a problem.


Posted

in

by

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.