Need More SEO Data? Consider the SEMRush API in your Workflow (Updated!)
I am a huge fan of SEMrush (and no, I’m not getting paid by SEMrush to say that… not yet, at least). I learned the basics of SEO through their free trial accounts, and they were the main tool I used on my first ever SEO project. But, brand evangelism aside, there is always an unquenchable thirst for more data and greater access to what SEMrush has to offer. As I have continued to build automation into my SEO process, I knew there had to be a way to use the tools I have become so accustomed to in a way that is faster, more streamlined, and more all-encompassing than how I had been using them since day one.
For those who don’t know, an API (short for Application Programming Interface) is an interface that lets software talk to software, whether that’s two sites communicating or just a few lines of a script. I won’t go into too many details about how APIs work (I couldn’t even if I tried), but I will point out that pretty much every SEO tool on the market has one. For this particular piece, I’ll be going in depth with one of my personal favorites. The real reason I’m writing this is so you don’t have to go through the same hours upon hours of research I did to end up with something you can start using right away.
Right off the bat, I’ll be showing how to access the SEMrush API via Python. I haven’t met many marketers who are proficient in PHP or Java, and Python is probably the easiest language for a non-developer (like me) to pick up and go.
To begin, you will want to grab this file from GitHub and follow SEMrush’s documentation for generating the key you will need to access the API (note: you will need a paid subscription to access the API). Once you have the file downloaded and the key generated, place the GitHub package wherever your IDE holds its modules and import it using the lines below.
Place your generated key inside the parentheses and you are ready to begin pulling data directly from SEMrush! What does this all mean? Well, let me show you some of the different things you can do with all of this access.
Using the SEMRush API via Wrapper
from python_semrush.semrush import SemrushClient
import pandas as pd
client = SemrushClient(key='your_semrush_api_key')
For those who do not want to rely on third-party wrappers, you can construct the API call yourself, using urllib to build the URL for the call and requests to send it to the server.
While the wrapper provides easy access to the API, it limits how much you can customize the calls. I have also found that the wrapper is not actively updated to cover the newest API features SEMRush has to offer (such as Version 4, which allows for POST-request creation of Site Audit campaigns, or the Version 3 updates that allow for GET requests of traffic data). The core calls I’ll go over here should all be accessible via the wrapper, but for those who want to take the learnings here beyond this article, I would recommend using the standalone functions in the long term.
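If you want a taste of what calling one of those newer endpoints looks like without the wrapper, here is a minimal sketch. The endpoint path and payload fields below are my placeholders rather than something pulled from this article, so double-check them against the SEMrush API documentation before you build on them.
import requests

api_key = 'your_semrush_api_key'

# Hypothetical sketch of a newer-style POST request: the exact path and
# payload fields are assumptions -- confirm them in the SEMrush API docs
response = requests.post(
    'https://api.semrush.com/management/v1/projects',  # illustrative endpoint
    params={'key': api_key},
    json={'project_name': 'My Audit Project', 'url': 'yourdomain.com'},
)
print(response.status_code)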
Using the SEMRush API via Separate Functions
import urllib
from urllib.parse import urlparse
import requests
import pandas as pd

### Load API key and root URL for the API call ###
api_key = 'your_api_key'
service_url = 'https://api.semrush.com'

### Function used to build the URL for a SEMrush API call ###
def semrush_call(call_type, phrase):
    params = {
        "?type": call_type,
        'key': api_key,
        'phrase': phrase,
        'database': 'us',  # change for a different market
        'display_limit': '10',
    }
    data = urllib.parse.urlencode(params, doseq=True)
    main_call = urllib.parse.urljoin(service_url, data)
    main_call = main_call.replace(r'%3F', r'?')
    return main_call

### Function used to parse data from semrush_call ###
def parse_response(call_data):
    results = []
    data = call_data.decode('unicode_escape')
    lines = data.split('\r\n')
    lines = list(filter(bool, lines))
    columns = lines[0].split(';')
    for line in lines[1:]:
        result = {}
        for i, datum in enumerate(line.split(';')):
            result[columns[i]] = datum.strip('"\n\r\t')
        results.append(result)
    return results
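Before scaling up, it helps to see the two functions chained together on a single keyword. A minimal sketch, assuming api_key and service_url are set as above ('kettle corn' is just a stand-in phrase):
# Build the call URL, request it, and parse the response into rows
raw_response = requests.get(semrush_call(call_type='phrase_this', phrase='kettle corn'))
rows = parse_response(call_data=raw_response.content)

# Each row is a dict keyed by the column names SEMrush returns
print(pd.DataFrame(rows))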
Instant Bulk Keyword Research
I know, I know, Google Keyword Planner usually takes the cake as the most utilized tool for bulk keyword research. But with the new Google Ads taking over, and the limits it puts on how much data you can access at one time, using Keyword Planner can eat up more time than you want. Using SEMrush’s API as your keyword planner lets you bulk import up to 50,000 keywords and go from hitting Enter to a CSV full of search volume, competition, and CPC data in under a minute. Just plug and play the code below.
from python_semrush.semrush import SemrushClient
import pandas as pd

client = SemrushClient(key='your_semrush_api_key')

yourKeywords = ['movie theater popcorn', 'cheddar cheese popcorn', 'kettle corn', 'kettle popcorn', 'caramel popcorn']
# yourKeywords can be substituted for a column of a CSV file using pandas

a = []
for every in yourKeywords:
    keywordInfo = client.phrase_this(phrase=every, database='us')
    a.append(keywordInfo.copy())

finalKeywordData = pd.DataFrame(i[0] for i in a)
finalKeywordData.to_csv("EnterYourCSVFileNameHere.csv")
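As the comment in the block above hints, you can swap the hardcoded list for a column from a CSV. A quick sketch, assuming a file and column name of your own choosing (both names below are placeholders):
# Load keywords from a CSV column instead of hardcoding the list
keyword_file = pd.read_csv('your_keyword_file.csv')  # placeholder file name
yourKeywords = keyword_file['Keyword'].tolist()      # placeholder column name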
Bulk Keyword Research without the Wrapper
yourKeywords = ['movie theater popcorn', 'cheddar cheese popcorn', 'kettle corn', 'kettle popcorn', 'caramel popcorn']
# yourKeywords can be substituted for a column of a CSV file using pandas

keyword_frame = []
for every in yourKeywords:
    data = requests.get(semrush_call(call_type='phrase_all', phrase=every))
    parsed_data = parse_response(call_data=data.content)
    df = pd.DataFrame(parsed_data)
    keyword_frame.append(df)

finalKeywordData = pd.concat(keyword_frame)
finalKeywordData.to_csv("EnterYourCSVFileNameHere.csv")
Go Past SEMRush’s Onsite Reporting Limits
Want to really understand your site’s overall rankings, but your site ranks for over 50,000 keywords? Instead of waiting for a custom report, you can almost double the information you can get about a domain in just seconds. Just sort by position using “po_asc” and “po_desc” to gain access to your top 50K and bottom 50K keywords. Working with an international site? Make sure to change the database to your country’s specific database code.
from python_semrush.semrush import SemrushClient
import pandas as pd

client = SemrushClient(key='your_semrush_api_key')

domainOrganic = 'yourdomain.com'
domainResult = client.domain_organic(domain=domainOrganic, database='us', display_limit=50000, display_sort='po_asc')
domainDataFrame = pd.DataFrame(domainResult)
domainDataFrame.to_csv("EnterYourCSVFileNameHere.csv")
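That call grabs the top 50K by position. To pull the bottom 50K as well, as mentioned above, run the same call sorted with “po_desc” and stack the two frames; a minimal sketch reusing the client from above:
# Same call, sorted descending by position, for the bottom 50K keywords
bottomResult = client.domain_organic(domain=domainOrganic, database='us', display_limit=50000, display_sort='po_desc')
bottomDataFrame = pd.DataFrame(bottomResult)

# Stack both pulls into one export, dropping any keywords that appear twice
fullDataFrame = pd.concat([domainDataFrame, bottomDataFrame]).drop_duplicates()
fullDataFrame.to_csv("EnterYourCSVFileNameHere.csv")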
SERP Scrape Without Needing to Build a Scraper
SERP data provides a ton of insight into the competitive landscape: who is dominating for which terms and where your competitive advantage may lie. But gathering it often becomes a tedious act of clicking through the SERPs manually or digging around for prebuilt bots that will almost certainly get caught by Google. Instead, get your data right from SEMRush using a series of functions to call for the data, parse it, and build it out into one complete Excel file.
from tqdm import tqdm  # progress bar for the scraping loop

def build_seo_urls(phrase):
    params = {
        "?type": "phrase_organic",
        'key': api_key,
        'phrase': phrase,
        'database': 'us',
        'display_limit': '10'
    }
    data = urllib.parse.urlencode(params, doseq=True)
    main_call = urllib.parse.urljoin(service_url, data)
    main_call = main_call.replace(r'%3F', r'?')
    return main_call

def parse_response(call_data):
    results = []
    data = call_data.decode('unicode_escape')
    lines = data.split('\r\n')
    lines = list(filter(bool, lines))
    columns = lines[0].split(';')
    for line in lines[1:]:
        result = {}
        for i, datum in enumerate(line.split(';')):
            result[columns[i]] = datum.strip('"\n\r\t')
        results.append(result)
    return results

# Load the keyword list and filter it to the categories you care about
df = pd.read_excel('keywords_list.xlsx')
categories = ['your', 'categories']  # placeholder: your own category filters
df = df[df['Category'].str.contains('|'.join(categories))]
terms = df.Keyword

# SERP Scrape
pos_count = 0
pos_correct = 0
second_pass = []
pos = list(range(1, 11))
frames = []
for kw in tqdm(terms):
    competitive_call = build_seo_urls(phrase=kw)
    competitive_response = requests.get(competitive_call)
    competitive_final = parse_response(call_data=competitive_response.content)
    df2 = pd.DataFrame(competitive_final)
    if len(df2) == len(pos):  # keep only complete, ten-result SERPs
        df2['Position'] = pos
        df2['Keyword'] = kw
        frames.append(df2)
        pos_correct += 1
    else:
        second_pass.append(kw)  # hold incomplete results for a second pass
    pos_count += 1

print('Original Data Input = {}'.format(len(terms)))
print('Data Collected = {}'.format(pos_correct))
print("Keyword Data Collected via SEMRush: {}%".format(pos_correct / pos_count * 100.0))

df = pd.concat(frames)
df = df[['Keyword', 'Domain', 'Url', 'Position']]
df.to_excel('serp_data.xlsx')
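One loose end: the second_pass list collects keywords that came back incomplete, but nothing above retries them. Here is a minimal sketch of one way to handle that, reusing the same functions and keeping however many results come back:
# Retry keywords that returned incomplete SERPs on the first pass
retry_frames = []
for kw in tqdm(second_pass):
    retry_response = requests.get(build_seo_urls(phrase=kw))
    retry_rows = parse_response(call_data=retry_response.content)
    df3 = pd.DataFrame(retry_rows)
    if len(df3) > 0:
        df3['Position'] = list(range(1, len(df3) + 1))  # number whatever rows returned
        df3['Keyword'] = kw
        retry_frames.append(df3)

# Fold the retried keywords back into the main export
if retry_frames:
    retry_df = pd.concat(retry_frames)[['Keyword', 'Domain', 'Url', 'Position']]
    df = pd.concat([df, retry_df])
    df.to_excel('serp_data.xlsx')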
A few lines of code are all you need to elevate your next SEO campaign. While the examples shown above are pieces of code I have implemented into my own workflow, the information you can access through SEMRush is limited only by your organization’s data needs. As you work further with the API beyond the Python wrapper, keep a tab open with the SEMRush API documentation. While it doesn’t present comprehensive examples for each of its uses, it will give you the inputs necessary for your code to work.
Looking for more examples on how to use the SEMRush API? Reach out to me directly via LinkedIn or Twitter!