Sending an unknown URI to the API endpoint v3/url
- Send unknown URI to api.hyscore.io/v3/url
Provided the URI's domain is not blocked as a known problematic domain, the URI is sent to our analytics pipelines.
- Wait.
The waiting time depends on several factors. If you are using the "live" pipeline (depending on your use case), URIs are crawled as fast as they come in, so a result should be available after about 1 second on average.
While this is the desired outcome, there are caveats. If you send a lot of URIs from the same domain at the same time, that domain will very likely block requests from hyScore: several dozen requests per second from the same IP look like a DDoS attack to it, and a proper analysis becomes impossible. Please read the article "Known limitation and restriction of the hyScore service".
That is why we have the "batch" pipeline. In this case, requests to a given domain are throttled to 10 per second per crawler instance. Instances scale with load, so ideally we still process more than 10 requests per second. A URI may take longer than on the "live" pipeline, but there will be a proper result. The maximum processing time we have seen was about 10 minutes.
If you're analyzing
- Call the API again with the same URI and get your result.
Here two things can be returned:
- Your result. All is well.
- A message that the URI is still processing. If you are on the "live" pipeline, please try again after 1-2 seconds. If you are on "batch", try again after 30 seconds.
- If a result is not provided within 15 minutes, you should queue the URI and request it again later, if that is feasible for you and your use case.
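The retry steps above can be sketched as a small polling loop. This is a minimal sketch, not an official client: `poll_result` and its `fetch` parameter are hypothetical names, and `fetch` stands for your own wrapper around the call to v3/url, which we assume returns a pair `(done, payload)`. Only the retry intervals and the 15-minute cutoff come from the text above.

```python
import time
from typing import Any, Callable, Optional, Tuple

# Retry intervals from the article: 1-2 s on "live", 30 s on "batch".
RETRY_SECONDS = {"live": 2.0, "batch": 30.0}
# After 15 minutes without a result, queue the URI and ask again later.
GIVE_UP_SECONDS = 15 * 60


def poll_result(
    uri: str,
    fetch: Callable[[str], Tuple[bool, Any]],
    pipeline: str = "live",
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[Any]:
    """Call fetch(uri) until a result is ready or 15 minutes have passed.

    `fetch` is a hypothetical wrapper around your own request to v3/url.
    It must return (done, payload); done=False means "still processing".
    Returns the payload, or None if the URI should be queued for later.
    """
    interval = RETRY_SECONDS[pipeline]
    deadline = clock() + GIVE_UP_SECONDS
    while True:
        done, payload = fetch(uri)
        if done:
            return payload
        # Stop if the next retry would fall after the 15-minute cutoff.
        if clock() + interval > deadline:
            return None
        sleep(interval)
```

A `None` return value means the URI did not finish in time, so the caller can put it on its own queue and request it again later. The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.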
As you can see, the number and domain diversity of the requests you send us strongly influence how long it takes to get your results.
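If you control the order in which you submit URIs, one possible way to increase domain diversity is to interleave them by host before sending. The sketch below is an illustration under that assumption, not a hyScore recommendation; `interleave_by_domain` is a hypothetical helper name, and it only reorders a list on your side.

```python
from collections import defaultdict, deque
from typing import Iterable, List
from urllib.parse import urlsplit


def interleave_by_domain(uris: Iterable[str]) -> List[str]:
    """Reorder URIs round-robin by host so same-host URIs are spread apart."""
    queues = defaultdict(deque)
    for uri in uris:
        # URIs without a scheme have no parsed hostname and share one group.
        queues[urlsplit(uri).hostname or ""].append(uri)

    ordered = []
    pending = deque(queues.values())
    while pending:
        queue = pending.popleft()
        ordered.append(queue.popleft())
        if queue:
            # Hosts with URIs left go to the back of the line.
            pending.append(queue)
    return ordered


batch = [
    "https://a.example/1",
    "https://a.example/2",
    "https://a.example/3",
    "https://b.example/1",
    "https://c.example/1",
]
print(interleave_by_domain(batch))
```

When one host dominates the list, its URIs still end up next to each other at the tail, so this reduces bursts but does not remove them.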