Direct search via html?

Academic patent researcher
I am researching many patent documents and would like to be able to directly search from links in my documents and spreadsheets. I tried the following but it didn't work...

<a href="http://www.wipo.int/patentscope/search/en/search.jsf?queryString=ALLNUM:%PCT/US2006/0123456%">http://www.wipo.int/patentscope/search/en/search.jsf?queryString=ALLNUM:%PCT/US2006/0123456%

Is this possible to do?

Re: Direct search via html?

Iustin
Administrator
The correct URL for direct searches is the following one:
http://www.wipo.int/patentscope/search/en/result.jsf?query=ALLNUM:US20060123456
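
As a rough illustration of the URL pattern above, the Python sketch below builds a direct-search link from a PCT application number. It assumes the number is formatted by dropping the "PCT/" prefix and the slashes, as the example URL suggests. The function name `build_search_url` is made up for this example and is not part of PATENTSCOPE.

```python
from urllib.parse import quote

# Base of the direct-search URL quoted in this thread.
BASE_URL = "http://www.wipo.int/patentscope/search/en/result.jsf?query=ALLNUM:"


def build_search_url(pct_number: str) -> str:
    """Return a direct-search URL for a number such as 'PCT/US2006/0123456'.

    Assumption: the search expects the number without the 'PCT/' prefix
    and without slashes (e.g. 'US20060123456').
    """
    compact = pct_number.strip().upper()
    if compact.startswith("PCT/"):
        compact = compact[len("PCT/"):]
    compact = compact.replace("/", "").replace(" ", "")
    # Percent-encode anything unusual that is left in the number.
    return BASE_URL + quote(compact)


print(build_search_url("PCT/US2006/0123456"))
```

A link built this way can be pasted into a document or generated per row in a spreadsheet export.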

Re: Direct search via html?

Jonathan Topper
Your response of July 16 is just what I was looking for. Thanks. Using this technique, I can get the HTML source code of a PCT application or WO publication using Microsoft's XML version 2.0 DLL, i.e.

With CreateObject("Msxml2.XMLHTTP")
    .Open "GET", "http://www.wipo.int/patentscope/search/en/result.jsf?query=ALLNUM:" & srchField, False
    .Send
    GetHTML = .responseText
End With

where srchField is a suitably formatted PCT or WO number, e.g. IL2006001320 derived from PCT/IL2006/001320.

This works well, and I can then parse the HTML code to obtain all the details such as WO publication number, name of Applicant, priority details and so on. It takes quite a bit of effort to study the HTML source and to develop the code to extract the data. For example, to get the PCT number using VB, you could write:

' Find the label, skip past two cell ends, then take the text between ">" and the next "<"
Start = InStr(1, GetHTML, "International Application No.:")
Start = InStr(Start + 1, GetHTML, "/TD>")
Start = InStr(Start + 4, GetHTML, "/TD>")
Start = InStr(Start + 5, GetHTML, ">")
Last = InStr(Start + 1, GetHTML, "<")
PCTNum = Mid(GetHTML, Start + 1, Last - Start - 1)

This works and one could follow a similar approach to extract all the other relevant information.

But it would be much nicer if, instead of getting the HTML code using GetHTML = .responseText, it were possible to download an equivalent HTML document conforming to the DOM standard, so that the elements could be accessed by name.

I am not experienced in HTML, and my efforts so far to generate a DOM HTML document from the source code have proven unsuccessful. I am not sure whether this is because it's not supported or because of a lack of knowledge on my part.

Is there a simple way to do what I am trying to do?

Thanks

Jonathan
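
One way to avoid counting string offsets by hand is to run the downloaded source through an HTML parser and look up table cells by their label. The Python sketch below uses only the standard library. The sample markup, the names `CellCollector` and `value_after_label`, and the assumption that the value sits in the first non-empty table cell after the label cell are all hypothetical and would need checking against the real result page.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, in document order.

    Nested tables are not handled; this is only a sketch.
    """

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names, so <TD> and <td> both match.
        if tag == "td":
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def value_after_label(html, label):
    """Return the first non-empty cell text after the cell containing label."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    cells = [text.strip() for text in parser.cells]
    for index, text in enumerate(cells):
        if label in text:
            for later in cells[index + 1:]:
                if later:
                    return later
    return None


# Made-up markup standing in for the real page source (GetHTML).
sample = """
<TABLE><TR>
<TD><B>International Application No.:</B></TD><TD></TD>
<TD>PCT/IL2006/001320</TD>
</TR></TABLE>
"""
print(value_after_label(sample, "International Application No.:"))
```

The same lookup could be repeated with other labels (applicant, priority data and so on), as long as the page keeps each value in a cell that follows its label.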

Re: Direct search via html?

Iustin
Administrator
We are actively working on a new feature that will render the result list in XML format, which should make it much easier to parse.
Iustin

Re: Direct search via html?

Chelo
Hello!
By looking at the code of the search page and at the suggestions in this topic, I learned a bit about URL-based search, but I was not able to find any documentation for it. Is there any?
Thank you very much
Chelo

Re: Direct search via html?

Jonathan Topper
In reply to this post by Iustin
I note that you allow XML status data to be downloaded. Is it possible yet to access this using a search string so that it can be parsed under VBA? If so, how is it done? If not, do you have any idea when it will be available?