Programmatic URL Index Check via Google Indexing API

Why Automate URL Index Checks

The Google Indexing API lets you programmatically notify Google about new or updated URLs and check the status of those notifications. For agencies and site owners managing thousands of pages, manually verifying indexing is impractical. The API replaces tedious spreadsheet checks with automated scripts that can run on a cron job.

In practice, when you monitor 50,000 backlinks or guest post pages weekly, you need a reliable system. The metadata endpoint returns a urlNotificationMetadata object whose latestUpdate record carries the notifyTime of the most recent update notification. A 404 response, or a response with no latestUpdate, means the URL was never submitted through the API. You can then decide to submit it for indexing.

A common situation we see: a site has 10,000+ pages, but only 40% are indexed. The developer scripts a daily check using the API, filters URLs with no notifyTime, and submits them in batches. This workflow cuts manual effort by hours and surfaces blocked or excluded URLs fast.

API Index Check Workflow

Authenticate

Set up OAuth 2.0 service account with scope https://www.googleapis.com/auth/indexing

Request Status

GET https://indexing.googleapis.com/v3/urlNotifications/metadata?url=ENCODED_URL

Parse Response

Read latestUpdate and its notifyTime from the JSON body. A missing latestUpdate, or a 404 response, means no record.

Filter Unindexed

Separate URLs with missing notifyTime. These need a submission request.

Submit Batch

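POST to https://indexing.googleapis.com/v3/urlNotifications:publish with URL_UPDATED type. Up to 100 calls can be combined into one multipart request to https://indexing.googleapis.com/batch, and each call still counts toward quota.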

Monitor Quota

Count the publish requests you send, since the API does not report remaining quota in its responses. Pause when the daily limit of 200 is reached.
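
The Request Status and Submit Batch steps both hinge on building the request correctly. Here is a minimal sketch of the two pieces, assuming only the standard library; the helper names `metadata_url` and `publish_body` are ours, not part of any Google client.

```python
import json
from urllib.parse import quote

METADATA_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications/metadata"
PUBLISH_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"


def metadata_url(page_url: str) -> str:
    """Return the metadata request URL with page_url percent-encoded."""
    # safe="" also encodes ":" and "/", so the page URL travels as one query value.
    return f"{METADATA_ENDPOINT}?url={quote(page_url, safe='')}"


def publish_body(page_url: str, kind: str = "URL_UPDATED") -> str:
    """Return the JSON body to POST to PUBLISH_ENDPOINT."""
    return json.dumps({"url": page_url, "type": kind})


if __name__ == "__main__":
    print(metadata_url("https://example.com/blog/post-1"))
    print(publish_body("https://example.com/blog/post-1"))
```

Encoding the whole page URL matters when it contains its own query string; otherwise its parameters would be read as parameters of the API call.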

API Endpoints and Response Diagnostics

Endpoint	Purpose	Method	Response Fields	Failure Mode
/urlNotifications/metadata	Check notification status	GET	`latestUpdate` record containing `notifyTime`	404 means URL never submitted. 403 means auth or ownership error.
/urlNotifications:publish	Submit URL for indexing	POST	`urlNotificationMetadata` object with timestamps	400 Bad Request if URL invalid or type wrong. 429 Too Many Requests if quota exceeded or rate limited.
/batch	Combine up to 100 calls in one request	POST	multipart/mixed body with one part per call	Does not save quota: every call inside the batch is counted individually.
OAuth 2.0 token endpoint	Generate access token	POST	`access_token` with expiry in seconds	Invalid grant if service account key is deleted or misconfigured. Set `sub` only when impersonating a user through domain-wide delegation.
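
The failure modes in the table translate naturally into a small dispatch function. This is a sketch only; the action labels are hypothetical names chosen for illustration, and treating every 5xx status as retryable is our assumption.

```python
# Map the HTTP statuses from the table to a next action.
# The action labels are illustrative, not API values.
ACTIONS = {
    200: "recorded",
    400: "fix_url",
    403: "check_ownership",
    404: "never_submitted",
    429: "back_off",
}


def action_for(status_code: int) -> str:
    """Return the follow-up action for a metadata or publish response."""
    if status_code in ACTIONS:
        return ACTIONS[status_code]
    if 500 <= status_code < 600:
        return "retry"  # assumption: server errors are transient
    return "investigate"


if __name__ == "__main__":
    for code in (200, 404, 429, 503):
        print(code, action_for(code))
```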

Worked Example: Checking 1,000 URLs with Quota Constraints

Assume you have a CSV with 1,000 backlink URLs to check. You budget 200 checks and 200 publish requests per day, matching the default publish quota of 200 requests per day. You cannot process all 1,000 in one day.

Step 1: Filter URLs to only those from domains with high priority (e.g., DA > 30). This reduces the list to 550 URLs.

Step 2: Day 1: Run the check script for the first 200 URLs. Parse responses: 120 have notifyTime (already submitted), 80 have no notifyTime. Submit those 80 via publish. That uses 80 of the day's 200 publish requests.

Step 3: Day 2: Check the next 200 URLs. 150 have notifyTime, 50 are new. Submit the 50. The quota reset overnight, so 150 publish slots remain unused that day.

Step 4: Day 3: Check the remaining 150 URLs. 100 have timestamps, 50 are new. Submit 50. All 550 prioritized URLs are now checked across 3 days, with 180 publish requests used in total (80 + 50 + 50).

Edge case: some URLs (say 15) may return HTTP 403 because the service account is not an owner of those sites' Search Console properties. Log those separately for manual investigation.
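
The day-by-day split is plain chunking arithmetic. A small sketch, assuming a self-imposed budget of `per_day` checks; the function name and the sample URLs are hypothetical.

```python
from typing import Iterator, List, Sequence


def daily_chunks(urls: Sequence[str], per_day: int = 200) -> Iterator[List[str]]:
    """Yield consecutive slices of at most per_day URLs, one slice per day."""
    if per_day <= 0:
        raise ValueError("per_day must be positive")
    for start in range(0, len(urls), per_day):
        yield list(urls[start:start + per_day])


if __name__ == "__main__":
    sample = [f"https://example.com/page-{i}" for i in range(5)]
    for day, chunk in enumerate(daily_chunks(sample, per_day=2), start=1):
        print(f"Day {day}: {len(chunk)} URLs")
    # Publish requests add up across days, e.g. the 80, 50 and 50 submissions above.
    print("publish total:", sum([80, 50, 50]))
```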

Authentication Setup for Service Account

You need a Google Cloud project with the Indexing API enabled. Create a service account and download the JSON key. Grant the service account the 'Owner' role on the Search Console property for each site you want to check. Without this, the API returns 403 errors.

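A service account normally authenticates as itself, so its own email must be listed as an owner of the Search Console property. If you use domain-wide delegation instead, the sub field in the JWT claim must be the verified owner email. We often see developers forget the sub field in that setup, leading to mysterious auth failures. Either way, double-check your service account's access to the Search Console property.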

For Python, use the google-auth and requests libraries. The access token expires after 3600 seconds. Refresh it programmatically before each batch. Store the token in memory, not on disk, to avoid stale credentials.
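
A minimal sketch of that setup with google-auth, assuming the key sits at a path you supply (`service-account.json` below is a placeholder). `AuthorizedSession` wraps `requests` and refreshes the access token in memory when it expires, so nothing needs to be written to disk.

```python
SCOPES = ["https://www.googleapis.com/auth/indexing"]


def build_session(key_path: str):
    """Return a requests-style session that attaches and refreshes access tokens."""
    # Imported here so the module still loads where google-auth is not installed.
    from google.oauth2 import service_account
    from google.auth.transport.requests import AuthorizedSession

    credentials = service_account.Credentials.from_service_account_file(
        key_path, scopes=SCOPES
    )
    return AuthorizedSession(credentials)


if __name__ == "__main__":
    # Placeholder path: point this at your own downloaded JSON key.
    session = build_session("service-account.json")
    print(type(session).__name__)
```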

Python Implementation Steps

Load service account JSON key and create credentials with the indexing scope.
Generate an OAuth 2.0 access token using `google.auth.transport.requests`.
Read your URL list from CSV, database, or sitemap. Deduplicate to avoid wasting quota.
For each URL, call the metadata endpoint. Catch HTTP errors (403, 404, 429) and log them separately.
If the response is a 404 or has no latestUpdate, call the publish endpoint with URL_UPDATED type. Add a 2-second delay between requests to respect rate limits.
Track quota with your own counter of publish requests sent, and stop when the daily limit is reached. The quota page in the Google Cloud console shows the authoritative usage.
Write results to a CSV with columns: URL, notifyTime, latestUpdate, publishStatus, error.
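
The steps above can be sketched as one loop. This is an illustration under assumptions, not a finished tool: `session` is any object with requests-style `get` and `post` methods (for example an authorized session), the `latestUpdate` column stores the notification type, and the `skipped_quota` label is our own.

```python
import time
from typing import Callable, Dict, Iterable, List
from urllib.parse import quote

BASE = "https://indexing.googleapis.com/v3/urlNotifications"


def check_and_publish(
    session,
    urls: Iterable[str],
    publish_budget: int = 200,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, str]]:
    """Check each URL once and publish those with no recorded notification."""
    rows: List[Dict[str, str]] = []
    remaining = publish_budget  # our own counter of publish requests left
    for url in dict.fromkeys(urls):  # de-duplicate while keeping order
        row = {"URL": url, "notifyTime": "", "latestUpdate": "",
               "publishStatus": "", "error": ""}
        resp = session.get(f"{BASE}/metadata?url={quote(url, safe='')}")
        if resp.status_code == 200:
            latest = resp.json().get("latestUpdate") or {}
            row["latestUpdate"] = latest.get("type", "")
            row["notifyTime"] = latest.get("notifyTime", "")
        elif resp.status_code != 404:
            # 403, 429 and others: record the code and move on.
            row["error"] = str(resp.status_code)
            rows.append(row)
            continue
        if not row["notifyTime"]:
            if remaining <= 0:
                row["publishStatus"] = "skipped_quota"
            else:
                pub = session.post(f"{BASE}:publish",
                                   json={"url": url, "type": "URL_UPDATED"})
                remaining -= 1
                row["publishStatus"] = str(pub.status_code)
                sleep(delay)  # pause between publish requests
        rows.append(row)
    return rows
```

Passing `sleep` and `session` in as parameters keeps the loop easy to dry-run against a fake session before it touches real quota.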

Quota Limits and Batch Processing Strategies

The Indexing API's default quotas are counted per Google Cloud project, not per service account: 200 publish requests per day, plus per-minute limits of 180 metadata requests and 380 requests overall. Publish and metadata are separate quotas, so status checks do not reduce the 200 daily submissions.

For large sites, you need to distribute submissions across days or request a higher quota for the project. Extra service accounts inside the same project share that project's quota, so they do not add capacity. A practical strategy is to stagger the work: Day 1 handle URLs 1-200, Day 2 handle 201-400, etc.

We often see developers hit the 429 rate limit because they send requests too fast. Add a delay of at least 1 second between requests. Use exponential backoff on 429 responses. Log each retry to avoid silent failures.
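
A minimal sketch of that retry policy, assuming `send` is a zero-argument callable that performs one request and returns an object with a `status_code`. The function name and defaults are ours; retrying 5xx responses as well as 429 is an assumption carried over from the error list in the FAQ.

```python
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], object],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    log: Callable[[str], None] = print,
):
    """Call send(), retrying on 429 and 5xx with exponentially growing waits."""
    attempt = 0
    while True:
        response = send()
        retryable = response.status_code == 429 or response.status_code >= 500
        if not retryable or attempt >= max_retries:
            return response
        wait = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
        # Log every retry so failures are never silent.
        log(f"retry {attempt + 1} in {wait:.0f}s (HTTP {response.status_code})")
        sleep(wait)
        attempt += 1
```

With the defaults, a request that keeps failing is attempted four times in total before the last response is handed back to the caller.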

An edge case: blocked URLs (e.g., disallowed by robots.txt or requiring authentication) get no special error from the metadata endpoint. A blocked URL that was never submitted returns the same 404 as any other unsubmitted URL. You must cross-reference with your crawl logs to confirm blockage.

FAQ

google indexing api check url status for agencies managing multiple sites

Use one service account per Google Cloud project. Add the service account email as an owner of each site's Search Console property. Loop through sites in your script, switching the base URL and auth context. Quota is counted per project, so monitor total publish requests across all sites to avoid exceeding 200 per day. For 50+ sites, consider requesting a higher quota or splitting sites across projects with separate quotas.

how to check if backlinks are indexed using google indexing api

The API does not directly show if a backlink is indexed. Instead, check the linking page (the page that contains the backlink) with the metadata endpoint. A latestUpdate record shows that Google received an update notification for that page, which is a signal rather than proof that it is indexed. The call only succeeds for pages on properties where your service account is an owner; other URLs return 403. Pair this with a crawl of the linking page to confirm the link exists. The API only tells you about the page, not the link itself.

google indexing api bulk check python script with csv

Write a Python script that reads a CSV with one column 'url'. For each row, call the metadata endpoint. Append results as new columns: notifyTime, latestUpdate, status. Use pandas for DataFrame operations. Handle 429 errors with time.sleep(60) and retry. Output a new CSV with index status. This script typically runs under 5 minutes for 200 URLs with 1-second delays.
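
A dependency-free sketch of the CSV round trip. It uses the standard `csv` module in place of pandas, and takes the per-URL check as a callable (`check`, a hypothetical parameter) so the file handling can be tested without network access.

```python
import csv
from typing import Callable, Dict, TextIO

FIELDS = ["url", "notifyTime", "latestUpdate", "status"]


def annotate_csv(src: TextIO, dst: TextIO,
                 check: Callable[[str], Dict[str, str]]) -> int:
    """Read a CSV with a 'url' column and write it back with status columns."""
    writer = csv.DictWriter(dst, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for row in csv.DictReader(src):
        url = (row.get("url") or "").strip()
        if not url:
            continue  # skip rows with an empty url cell
        result = check(url)
        writer.writerow({
            "url": url,
            "notifyTime": result.get("notifyTime", ""),
            "latestUpdate": result.get("latestUpdate", ""),
            "status": result.get("status", ""),
        })
        count += 1
    return count


if __name__ == "__main__":
    # File names are placeholders; the stub check stands in for a real API call.
    with open("urls.csv", newline="") as src, open("results.csv", "w", newline="") as dst:
        annotate_csv(src, dst, lambda url: {"status": "unchecked"})
```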

google indexing api common errors and how to fix them

403: service account lacks Search Console owner role. Fix: verify ownership and add email. 400: invalid URL (must start with http or https). Fix: URL-encode and validate. 429: rate limit exceeded. Fix: add delay or reduce concurrency. 404: URL never submitted. Not an error, just means no record. 500: transient Google error. Retry with exponential backoff up to 3 times.

google indexing api quota limit per day for metadata requests

The default publish quota is 200 requests per day, counted per Google Cloud project rather than per service account. Metadata requests are a separate counter with a per-minute limit (180 per minute by default), so status checks do not reduce the publish allowance. Metadata requests check status; publish requests submit. Daily quota resets at midnight Pacific Time. Track usage with your own request counter or on the project's quota page in the Google Cloud console.

google indexing api vs search console api for index coverage

Search Console API gives property-level data such as search analytics and sitemaps, plus the coverage state of a single URL through its URL Inspection endpoint. Indexing API gives per-URL notification and notification status. Use Search Console for dashboard-level checks. Use Indexing API for programmatic per-URL workflows. They complement each other. The Indexing API is faster for individual URL checks but has lower publish quota.

how to automate google indexing api workflow with cron job

Schedule a cron job daily that runs your Python script. Store the list of URLs to check in a database table with a 'last_checked' column. Each run picks up URLs that have not been checked in the last 7 days. Respect daily quota by limiting batch size to 200. Log results to a separate table for auditing. Use a lock file to prevent overlapping runs.
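
The selection step can be sketched with SQLite. The table name `urls`, its columns, and the function names are assumptions for illustration; timestamps are assumed to be stored as UTC ISO 8601 strings so that string comparison matches time order.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import List


def due_urls(conn: sqlite3.Connection, now: datetime,
             max_age_days: int = 7, limit: int = 200) -> List[str]:
    """Return up to `limit` URLs never checked or last checked before the cutoff."""
    cutoff = (now - timedelta(days=max_age_days)).isoformat()
    rows = conn.execute(
        "SELECT url FROM urls "
        "WHERE last_checked IS NULL OR last_checked < ? "
        "ORDER BY last_checked LIMIT ?",  # SQLite sorts NULLs first, then oldest
        (cutoff, limit),
    ).fetchall()
    return [row[0] for row in rows]


def mark_checked(conn: sqlite3.Connection, urls: List[str], now: datetime) -> None:
    """Stamp the given URLs with the time of this run."""
    conn.executemany(
        "UPDATE urls SET last_checked = ? WHERE url = ?",
        [(now.isoformat(), url) for url in urls],
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE urls (url TEXT PRIMARY KEY, last_checked TEXT)")
    conn.execute("INSERT INTO urls VALUES ('https://example.com/a', NULL)")
    now = datetime.now(timezone.utc)
    batch = due_urls(conn, now)
    mark_checked(conn, batch, now)
    print(batch)
```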

google indexing api alternatives for checking url index status

Alternatives: 1) Search Console API URL Inspection endpoint (official, reports the actual index status of a URL, and has its own per-property quotas). 2) Google Custom Search API (returns indexed status indirectly via search results). 3) Scraping Google search results with 'site:URL' (against ToS). Of these, only URL Inspection is an official per-URL status check. The Indexing API is designed for notifications, not bulk checking, but works for small batches.

how to handle duplicate URLs in google indexing api batch

Deduplicate your URL list before processing. The API does not reject duplicates, but you waste quota. Use a Python set or pandas drop_duplicates(). Also check for URL variants: trailing slash vs no slash, http vs https, www vs non-www. Normalize all URLs to one canonical form (e.g., lowercase scheme and host, https, no trailing slash) before checking, and make sure that form matches the URL the site actually serves. Paths can be case-sensitive, so leave their case alone. This reduces false negatives.
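
One way to sketch that normalization with the standard library. The rules here (force https, lowercase the host, drop the trailing slash and fragment, optionally strip `www.`) are assumptions to adapt to each site's real canonical form; the path's case is left untouched.

```python
from typing import Iterable, List
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str, strip_www: bool = False) -> str:
    """Return url in one canonical form so variants collapse to a single key."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if strip_www and host.startswith("www."):
        host = host[4:]
    if parts.port:
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/") or "/"  # keep "/" for the site root
    # Assumption: the site serves https; the fragment is dropped.
    return urlunsplit(("https", host, path, parts.query, ""))


def dedupe(urls: Iterable[str], strip_www: bool = False) -> List[str]:
    """Normalize and de-duplicate while preserving first-seen order."""
    return list(dict.fromkeys(normalize(u, strip_www) for u in urls))


if __name__ == "__main__":
    print(dedupe([
        "http://Example.com/blog/",
        "https://example.com/blog",
        "https://www.example.com/blog",
    ], strip_www=True))
```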

google indexing api metadata response fields explained

The metadata response contains the url plus up to two notification records: latestUpdate and latestRemove. Each record holds the notified url, the type (URL_UPDATED or URL_DELETED), and notifyTime, the timestamp at which Google received that notification. If the URL was never submitted through the API, the endpoint returns 404 rather than an empty record. A latestUpdate record confirms that an update notification was received; on its own it does not prove the page is indexed.
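
A short parsing sketch, assuming the response nests `notifyTime` inside `latestUpdate` and `latestRemove` records as described in the API reference for UrlNotificationMetadata. The sample body and the function name are illustrative.

```python
import json
from typing import Dict, Optional

SAMPLE = """
{
  "url": "https://example.com/jobs/42",
  "latestUpdate": {
    "url": "https://example.com/jobs/42",
    "type": "URL_UPDATED",
    "notifyTime": "2024-01-01T12:00:00Z"
  }
}
"""


def notification_times(body: str) -> Dict[str, Optional[str]]:
    """Extract the update and removal notifyTime values, or None when absent."""
    data = json.loads(body)
    update = data.get("latestUpdate") or {}
    remove = data.get("latestRemove") or {}
    return {
        "url": data.get("url"),
        "updated": update.get("notifyTime"),
        "removed": remove.get("notifyTime"),
    }


if __name__ == "__main__":
    print(notification_times(SAMPLE))
```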
