We are often asked why a specific result occurs for a top-level domain (TLD) or a single page URL. This article describes, from a high-level perspective and with some examples, how the analysis works.
How you use the hyScore API depends on your individual use case and needs. The system always provides a response; you have to decide how to handle and process each response on your side.
For a website owner who wants to analyze their own content, this is quite easy. If you take a more general approach, with millions of different domains and single URLs to analyze, it can be tricky because the result always depends on the source of the content.
How it works
Prerequisites
- Customer A wants to analyze a specific single URL to get valuable content insights.
- The single URL is unknown to hyScore.
- hyScore does not block the domain/TLD.
- The URL is sent to the API endpoint v3/url.
1. API request: New URL, analysis in progress
Customer A sends the URL http://www.example.com/?article=293020-something to the v3/url API endpoint.
The system checks whether it has seen the URL before. In this example, the single URL is unknown to hyScore, so the response is:
{
"url": "http://www.example.com/?article=293020-something",
"status": {
"message": "New URL, analysis in progress",
"code": 0,
"type": "No data yet"
},
"tld": "www.example.com"
}
Advice: If the domain/URL has been put on hyScore's TLD blacklist, you will receive this response instead:
{
"url": "https://www.example.com/",
"status": {
"message": "URL blocked by hyScore",
"code": 98,
"type": "Error"
},
"tld": "www.example.com"
}
The blocked domains are mostly unimportant subdomains (such as ad-serving or CDN URLs) that provide no valuable content or information for analysis. When we become aware of such domains, we put them on our blacklist so that the response tells you this kind of URL is not worth analyzing.
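Because a blocked domain keeps returning code 98, a client that analyzes many URLs may want to remember blocked hosts and skip them locally. The following is a minimal sketch, not part of the hyScore API: the class `BlockedHostCache` and its methods are hypothetical, and it assumes that the `tld` field carries the host name of the URL, as in the sample responses above.

```python
from urllib.parse import urlsplit

BLOCKED_CODE = 98  # "URL blocked by hyScore", as in the sample response


class BlockedHostCache:
    """Remembers hosts that hyScore reported as blocked (hypothetical helper)."""

    def __init__(self):
        self._blocked = set()

    def record(self, response: dict) -> None:
        """Store the host of a response whose status code is 98."""
        status = response.get("status", {})
        # The sample responses show the code both as a number and as a string.
        if str(status.get("code")) == str(BLOCKED_CODE):
            # Assumption: "tld" holds the host name; fall back to the URL's host.
            host = response.get("tld") or urlsplit(response.get("url", "")).hostname
            if host:
                self._blocked.add(host.lower())

    def should_skip(self, url: str) -> bool:
        """Return True if the URL's host was previously reported as blocked."""
        host = urlsplit(url).hostname
        return host is not None and host.lower() in self._blocked
```

Calling `record()` on every response and `should_skip()` before every request avoids sending URLs that are already known to be blocked.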
2nd API request for the same URL within a reasonable period (5 seconds to 1 minute)
When hyScore receives the 1st API request for a new single URL, the URL is forwarded to a processing queue to crawl, extract and analyze the content.
The analysis of a new URL that hyScore has never seen before takes ~1 second on average before a full or partial result can be provided.
If the analysis was completed successfully before the 2nd request arrives, you'll receive:
{
"status": {
"message": "All seems well.",
"code": "1",
"type": "Ok"
},
"sentiment": 0.3298329389,
"url": "http://www.example.com/?article=293020-something",
"result": "<result in here>"......,
}
If you receive the following status message instead, consider queueing the URL to fetch the result at a later time, or request it repeatedly until the result is fully available.
{
"url": "http://www.example.com/?article=293020-something",
"status": {
"message": "Analysis in progress",
"code": "9",
"type": "Incomplete"
},
"result": {},
"tld": "www.example.com"
}
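Queueing a URL for a later request, as suggested above, can be sketched as follows. This is only an illustration under stated assumptions: `RetryQueue` and its methods are hypothetical names, not part of the hyScore API, and the 60-second default delay is an arbitrary choice, not a documented value.

```python
import heapq
import time
from typing import List, Optional, Tuple


class RetryQueue:
    """Holds URLs with an unfinished analysis until they are due again."""

    def __init__(self, delay_seconds: float = 60.0):
        self._delay = delay_seconds  # assumed waiting time, tune to your needs
        self._heap: List[Tuple[float, str]] = []  # (due time, url), earliest first

    def add(self, url: str, now: Optional[float] = None) -> None:
        """Schedule `url` for another request `delay_seconds` from now."""
        now = time.time() if now is None else now
        heapq.heappush(self._heap, (now + self._delay, url))

    def due(self, now: Optional[float] = None) -> List[str]:
        """Remove and return every URL whose waiting time has passed."""
        now = time.time() if now is None else now
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[1])
        return ready
```

A worker can call `due()` periodically, request each returned URL again, and call `add()` once more for any URL that is still incomplete.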
For the 3rd, 4th and any following requests for the same single URL, the system provides the result within 100 milliseconds on average.
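The status codes shown so far (0, 1, 9 and 98) can be mapped to client-side actions in one place. The sketch below is an assumption about how a client might do this, not hyScore's own logic; `Action` and `next_action` are hypothetical names. Because the sample responses show the code both as a number (0) and as a string ("1"), the value is normalized first.

```python
from enum import Enum


class Action(Enum):
    USE_RESULT = "use_result"    # code 1: the result is available
    RETRY_LATER = "retry_later"  # code 0 or 9: analysis not finished yet
    SKIP = "skip"                # code 98: blocked by hyScore
    UNKNOWN = "unknown"          # any code not covered in this article


_ACTIONS = {
    0: Action.RETRY_LATER,
    1: Action.USE_RESULT,
    9: Action.RETRY_LATER,
    98: Action.SKIP,
}


def next_action(response: dict) -> Action:
    """Map a parsed API response to a client-side action."""
    raw = response.get("status", {}).get("code")
    try:
        code = int(raw)  # accepts both 0 and "1" style codes
    except (TypeError, ValueError):
        return Action.UNKNOWN
    return _ACTIONS.get(code, Action.UNKNOWN)
```

Keeping the mapping in a single table makes it easy to extend if further status codes need to be handled.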
To take into consideration
API request for the same single URL within a shorter period (1 ms to 5 seconds)
As mentioned above, the analysis of a new URL that hyScore has never seen before takes ~1 second on average before a full or partial result can be provided.
Advice: hyScore can only provide high-quality results from sources/websites whose quality, performance and content meet specific standards. The service may be subject to restrictions for individual TLDs and website structures. Please find the current limitations of the service here: Known limitation and restriction of the hyScore service.
If you request the new single URL again only a few milliseconds after you sent it the first time, it may still be in analysis. In this case, you'll receive the following status message until the first analysis has finished:
{
"url": "http://www.example.com/?article=293020-something",
"status": {
"message": "Analysis in progress",
"code": "9",
"type": "Incomplete"
},
"result": {},
"tld": "www.example.com"
}
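One way to handle this case is to repeat the request with growing pauses until the status changes. The following is a minimal sketch, assuming a caller-supplied `fetch` function that sends the URL to the v3/url endpoint and returns the parsed JSON. The name `poll_until_ready`, its parameters and the delay values are hypothetical and not part of the hyScore API.

```python
import time
from typing import Callable, Optional


def poll_until_ready(
    fetch: Callable[[str], dict],
    url: str,
    attempts: int = 5,
    first_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Request `url` until the status code is 1, doubling the wait each time.

    Returns the finished response, or None if the attempts are used up or
    the status is one that retrying cannot fix (for example 98, blocked).
    """
    delay = first_delay
    for attempt in range(attempts):
        response = fetch(url)
        code = str(response.get("status", {}).get("code"))
        if code == "1":             # "Ok": the result is available
            return response
        if code not in ("0", "9"):  # neither new nor in progress: give up
            return None
        if attempt < attempts - 1:
            sleep(delay)            # still in analysis: wait, then ask again
            delay *= 2
    return None
```

With the defaults, the pauses are 0.5, 1, 2 and 4 seconds, which comfortably covers the ~1 second average analysis time mentioned above. URLs that are still unfinished afterwards are better queued for a later request.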
You may also receive an "Incomplete" status if the analysis failed partially (e.g. because of a categorization issue). In this case, we recommend queueing the URL on your side and trying again later to get the full result. If the categorization of a URL failed in the first run, hyScore automatically retries it every time the URL is requested again.
"Incomplete" URLs should also be queued so that a later request can retrieve the full result.
If the analysis was completed successfully before the 2nd API request, you'll receive:
{
"status": {
"message": "All seems well.",
"code": "1",
"type": "Ok"
},
"sentiment": 0.3298329389,
"url": "http://www.example.com/?article=293020-something",
"result": "<result in here>"......,
}
To be continued ...