Automating Article Schema Implementation in Hubspot

Derek Hawkins
4 min read · Feb 23, 2022

I have to admit, switching from agency to in-house SEO has come with a slew of changes in both working style and mindset. At a tech startup with limited design and development resources, where my time needs to go to high-impact activities, typical SEO recommendations tend to slide down the backlog. So I keep hunting not just for quick-win optimizations but for scalable systems to implement them. One such activity was building and implementing Schema in bulk, a not-so-simple feat in Hubspot.

Update (4/11/22): I presented this business case with the Hubspot Developer Advocacy team. You can see the broad overview of this article in the video below:

Why Not Just Use a Plug-in?

Funny enough, that is exactly what I wanted to do from the start! In looking for the path of least resistance, though, what I mainly found was the following:

  • Plug-ins with unnecessary subscription costs and little to no (human) support or documentation.
  • 3rd-party applications that would require looping in multiple people to build the integration. Justifying the ROI of marketing ops and dev time spent on schema is an all-uphill battle.
  • Universally, no support for non-standard Schema types (a nice-to-have, not a need-to-have).

Ultimately, I find that going the plug-in route creates long-term dependencies on tools that you don’t have complete control over. For small organizations (sub-50-person companies with only 1 or 2 people working on the website), I would still recommend a plug-in (provided you want to pay) for ease of access. But for SEOs and website managers who need a simple way to dynamically generate and inject Article schema in a Hubspot environment (along with a framework for automating further schema building and injection), this is the solution for you.

Getting Started

To start, you will need the following Python libraries:

  • Hubspot: The official Hubspot Python API SDK
  • Genson: a JSON schema building library
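  • A few core essentials: json and time ship with Python's standard library, while Pandas and Requests are third-party packages that come preinstalled with many data science distributions or can be added with pip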
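
Before going further, it can help to confirm that the third-party packages are actually installed. The snippet below is a minimal sketch of such a check, not part of the original script. The `REQUIRED` mapping and the `missing_packages` helper are illustrative names; the pip package names on the right-hand side are the ones published on PyPI (the Hubspot SDK is imported as `hubspot` but installed as `hubspot-api-client`).

```python
from importlib.util import find_spec

# Import name -> pip package name. The mapping itself is an illustrative
# helper, not something the original script defines.
REQUIRED = {
    "hubspot": "hubspot-api-client",
    "genson": "genson",
    "pandas": "pandas",
    "requests": "requests",
}


def missing_packages(import_names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in import_names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        pip_names = " ".join(REQUIRED[name] for name in missing)
        print(f"Missing packages, try: pip install {pip_names}")
    else:
        print("All required packages are installed.")
```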

Pulling the Right Data

There are a few different options for pulling out data, either exporting it directly from Hubspot or making an API call. In our case, we will be pulling all of the information we need from the Hubspot API, including all public blog posts, their corresponding IDs, and the information we need to build out the Article schema.

If you have never used the Hubspot API, you can get your API key by following these directions and brush up on the basics in their documentation.

Rather than using the Hubspot API SDK for this call, we will call the Blog Post API endpoint using requests for ease of access. Because the Hubspot Blog Post API returns at most 100 posts per call, we will loop in batches of 100 until we have collected every blog post on our site. In this instance, we are working with a site with approx. 1,000 blog posts.
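
As a rough illustration of that loop, here is a minimal sketch. It makes several assumptions that are not from the original script: it targets the v3 endpoint (`/cms/v3/blogs/posts`) with cursor-based `after` paging, and it authenticates with a private app access token sent as a bearer header (Hubspot has since moved away from API keys). The names `fetch_page` and `iter_posts` are hypothetical, so check the endpoint and paging fields against the current Hubspot documentation before relying on them.

```python
from typing import Callable, Iterator, Optional

# Assumed endpoint and page size; verify against the Hubspot docs.
API_URL = "https://api.hubapi.com/cms/v3/blogs/posts"
PAGE_SIZE = 100


def fetch_page(token: str, after: Optional[str] = None) -> dict:
    """Fetch one page of blog posts (network call, needs `requests`)."""
    import requests  # third-party; imported here so the paging logic runs without it

    params = {"limit": PAGE_SIZE}
    if after:
        params["after"] = after
    response = requests.get(
        API_URL,
        headers={"Authorization": f"Bearer {token}"},
        params=params,
        timeout=30,
    )
    response.raise_for_status()
    return response.json()


def iter_posts(get_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every post, following the paging cursor until none is left."""
    after = None
    while True:
        page = get_page(after)
        yield from page.get("results", [])
        # Assumed response shape: {"paging": {"next": {"after": "<cursor>"}}}
        after = page.get("paging", {}).get("next", {}).get("after")
        if not after:
            break
```

With a real token, usage would look something like `posts = list(iter_posts(lambda after: fetch_page(token, after)))`. Passing the page fetcher in as a function keeps the paging logic easy to test without touching the network.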

As we go through each iteration, we will be parsing the JSON response for the information relevant to our schema, including the article title, author, description, publish date, modification date, and featured image. For each post, we store those elements in a dictionary, append it to a list, and build a DataFrame from that list. At the end of the loop, we concatenate the per-batch DataFrames into a single DataFrame.
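
The parsing step might look roughly like the sketch below. The response field names (`htmlTitle`, `authorName`, `metaDescription`, `publishDate`, `updated`, `featuredImage`, `headHtml`) are assumptions based on the v3 blog post object and may differ in other API versions; `post_to_row` and `build_rows` are hypothetical helper names.

```python
from typing import Iterable, List


def post_to_row(post: dict) -> dict:
    """Flatten one blog post from the API response into a flat record.

    The keys read from `post` are assumed v3 field names; adjust them
    if your API version returns different ones.
    """
    return {
        "id": post.get("id"),
        "url": post.get("url"),
        "title": post.get("htmlTitle") or post.get("name"),
        "author": post.get("authorName"),
        "description": post.get("metaDescription"),
        "publish_date": post.get("publishDate"),
        "modified_date": post.get("updated"),
        "featured_image": post.get("featuredImage"),
        "head_html": post.get("headHtml"),
    }


def build_rows(posts: Iterable[dict]) -> List[dict]:
    """Convert every post in a batch into a row dictionary."""
    return [post_to_row(post) for post in posts]
```

A list of row dictionaries like this can be handed straight to `pandas.DataFrame(rows)`, and the per-batch frames combined with `pandas.concat`.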

Building the Article Schema

I’ve written about a Pythonic framework for building schema in the past, and we will be leveraging it here as well. The function used in this step borrows elements from one I built for that framework. While we will pull most of what we need via the Hubspot API, you will still want to manually add the remaining elements the Article schema needs in order to be valid.
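
The original function is not reproduced here, so the following is only a sketch of what such a builder could look like. It uses a plain dictionary rather than the Genson-based approach mentioned earlier, which is a deliberate simplification. The `PUBLISHER` values are placeholders for the manually supplied elements, and the row keys (`url`, `title`, `author`, and so on) are assumed column names, not ones confirmed by the source.

```python
from typing import Optional

# Placeholder publisher details: these are the manually supplied elements.
# Replace them with your own organization's name and logo URL.
PUBLISHER = {
    "@type": "Organization",
    "name": "Example Co",
    "logo": {"@type": "ImageObject", "url": "https://www.example.com/logo.png"},
}


def build_article_schema(row: dict, publisher: Optional[dict] = None) -> dict:
    """Build a schema.org Article object from one row of post data."""
    author_name = row.get("author")
    schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        "mainEntityOfPage": {"@type": "WebPage", "@id": row["url"]},
        "headline": row["title"],
        "description": row.get("description"),
        "image": row.get("featured_image"),
        "author": {"@type": "Person", "name": author_name} if author_name else None,
        "datePublished": row.get("publish_date"),
        "dateModified": row.get("modified_date"),
        "publisher": publisher or PUBLISHER,
    }
    # Drop properties with no value so the output has no empty fields.
    return {key: value for key, value in schema.items() if value not in (None, "")}
```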

With the data all in one place and the tools to build the schema outlined, we can iterate over our DataFrame and build the schema in a new column. We will use an f-string to make sure we have the JSON-LD <script> tag in place prior to adding to the post.
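
The wrapping step can be sketched as follows. The helper name `to_script_tag` is hypothetical, and escaping `</` inside the JSON is an extra precaution added in this sketch (so a stray `</script>` in a title cannot close the tag early), not something the original script is known to do.

```python
import json


def to_script_tag(schema: dict) -> str:
    """Serialize a schema dictionary and wrap it in a JSON-LD <script> tag."""
    # "<\/" is a valid JSON escape for "</", so the payload still parses
    # to the same values while staying safe inside an HTML script element.
    payload = json.dumps(schema, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

On a DataFrame, a function like this could be applied row by row to fill the new schema column.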

Implementation

Now that we have the schema built out, we can inject it directly into the HTML head of each blog post. Using the Hubspot API wrapper, we can simplify the authentication needed for the POST request. We’ll take the id we pulled from our initial Hubspot API call, pair it with our newly built schema, and iterate over the entire DataFrame. I have added a sleep timer so that the loop doesn’t fail partway through by running into Hubspot API rate limits.
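
A minimal sketch of that update loop is shown below. It swaps in a plain `requests` PATCH against an assumed v3 endpoint (`/cms/v3/blogs/posts/{id}` with a `headHtml` field) instead of the SDK call described above, since the exact SDK method is not shown in this article. All function names are hypothetical. The sketch also appends the schema to any existing head HTML rather than overwriting it, which is a design choice of this example.

```python
import time
from typing import Callable, Iterable, List, Optional


def merge_head_html(existing: Optional[str], script_tag: str) -> str:
    """Append the schema tag to existing head HTML, skipping duplicates."""
    existing = existing or ""
    if script_tag in existing:
        return existing
    return f"{existing}\n{script_tag}".strip()


def push_schema(
    rows: Iterable[dict],
    update_post: Callable[[str, str], None],
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Send each row's schema to its post, pausing between requests."""
    updated = []
    for row in rows:
        head_html = merge_head_html(row.get("head_html"), row["schema"])
        update_post(row["id"], head_html)
        updated.append(row["id"])
        sleep(delay_seconds)  # stay under the API rate limit
    return updated


def make_updater(token: str) -> Callable[[str, str], None]:
    """Build an update function that calls the assumed v3 PATCH endpoint."""

    def update(post_id: str, head_html: str) -> None:
        import requests  # third-party; only needed for the real network call

        response = requests.patch(
            f"https://api.hubapi.com/cms/v3/blogs/posts/{post_id}",
            headers={"Authorization": f"Bearer {token}"},
            json={"headHtml": head_html},  # assumed field name
            timeout=30,
        )
        response.raise_for_status()

    return update
```

With a DataFrame, the rows could come from `df.to_dict("records")`, and the call would look like `push_schema(rows, make_updater(token))`. It is worth trying this on a single test post first to confirm how the update behaves for published posts.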

With that, you’ll be all set! The Article schema will be live on-site and ready to validate using your preferred toolset. The same implementation approach is also useful for site owners looking to bulk upload recommendations from internal or external SEO resources.

Overall, if you are in a time-sensitive position where quick wins need to be implemented quickly, but you are resource-strapped and looking for a way to act immediately, this process is suited for you!

Feel free to reach out to me directly on LinkedIn or Twitter with any questions on this script or process!

Derek Hawkins

SEO Manager for @DominoDataLab | SEO/Growth Marketing | Writer | Programmer | Start-up Enthusiast