Keyword search techniques and ideas

2019.06.01 | search column

This article reflects our research at the time of writing. Please note that some details may differ from the latest information.

1. Introduction

Keyword searching is familiar to everyone from Google search, and in patent searching it is a method that can be performed easily without using patent classifications. This article introduces keyword search methods, from the basics through to applied techniques and ideas, that are useful mainly for patent searches and also for literature searches.

2. Basic keyword search approach

(1) Synonyms and near-synonyms

When searching for a specific item by keyword, it is essential to consider its synonyms and near-synonyms.

 Example. (Synonyms / near-synonyms) Ship + Boat + Ferry

(2) Abbreviations, spelling variants, conversions

Synonyms and near-synonyms alone cannot cover variation in how a term is written, so also include abbreviations, spelling variants, and conversions (kanji, hiragana, katakana, and alphabet conversion, as well as English-name conversion).

 Example. (Abbreviation) Personal computer + pasokon (the Japanese short form) + PC
 Example. (Spelling variants) the different Japanese transliterations of "Venice"
 Example. (Conversion) "car" written in kanji, hiragana, and katakana + CAR

For compounds, conversions between the systematic (IUPAC) name, the trivial name, and the CAS number are also available.
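
As a minimal sketch of how the term lists above can be assembled into a search expression, the following Python function joins the variants of one concept into a single OR group. It assumes that "+" denotes OR, as in the examples in this article. The function name or_group is our own illustration and is not part of any search system.

```python
def or_group(terms):
    """Join the variants of one concept with '+', treated here as OR."""
    unique = []
    for term in terms:
        cleaned = term.strip()
        # Skip blanks and exact duplicates, but keep case variants such as "CAR".
        if cleaned and cleaned not in unique:
            unique.append(cleaned)
    return "(" + " + ".join(unique) + ")"


if __name__ == "__main__":
    # Synonyms and abbreviations are collected first, then joined per concept.
    print(or_group(["ship", "boat", "ferry"]))
    print(or_group(["personal computer", "PC"]))
```

Keeping one group per concept makes it easy to add or drop a variant later without rewriting the whole expression.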

(3) Patent terms, dialects, buzzwords

Patent documents frequently use patent-specific terms as well as terms peculiar to a specific company (dialects), so consider using them in the search.

 Example. (Patent term) Sliding + sliding
 Example. (Dialect) ○○○○ mechanism

* A dialect may hit only documents from specific companies, so consider carefully whether to use it. It can also be hard to judge whether a phrase is a dialect at all: a term that began as a dialect in a niche industry may later pass into general use.

Depending on the field, some buzzwords appear frequently only in a particular period. Used well, they can be effective in invalidity searches.

3. Applied keyword search approach

(1) Hypernyms / hyponyms

Next, as an applied approach, consider hypernyms (broader terms) and hyponyms (narrower terms).

 Example. (Hypernyms / hyponyms) Electronic components + semiconductors + IC chips

Keep in mind that the list of hypernyms and hyponyms has to be developed for the particular field being searched. In a search on electronic "equipment", "IC chip" serves as a hyponym of "semiconductor". In a search on electronic "materials", on the other hand, a specific compound name such as "gallium nitride" may be the more appropriate hyponym of "semiconductor". When actually building a search formula, do not lump hypernyms and hyponyms together. Stay aware of the hierarchy: group the synonyms and near-synonyms at the hypernym level, and separately group those at the hyponym level. Assembling the formula this way leads to a good search.
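
The level-by-level grouping described above can be sketched as a small data structure. The following Python example is only an illustration under our own assumptions: the dictionary layout, the function name level_groups, and the extra terms "integrated circuit" and "GaN" are not from the original examples, and "+" is again treated as OR.

```python
# Illustrative hierarchy per field, broadest level first.
# Terms marked "added" are assumptions introduced for this sketch.
HIERARCHY = {
    "equipment": [
        ["electronic component"],
        ["semiconductor"],
        ["IC chip", "integrated circuit"],  # "integrated circuit" added
    ],
    "materials": [
        ["electronic component"],
        ["semiconductor"],
        ["gallium nitride", "GaN"],  # "GaN" added
    ],
}


def level_groups(field, depth):
    """Return one OR group per hierarchy level, down to `depth` levels."""
    groups = HIERARCHY[field][:depth]
    return ["(" + " + ".join(group) + ")" for group in groups]


if __name__ == "__main__":
    # Each level stays a separate group, so it can be used or dropped on its own.
    for field in ("equipment", "materials"):
        print(field, level_groups(field, 3))
```

Because each level is kept as its own group, the searcher can decide per search whether to include only the narrow terms or to widen the search with the broader levels.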

(2) Proximity (neighborhood) search

Some search tools also support proximity search on keywords. Proximity search comes in a "short proximity" and a "long proximity" form, and it is useful to choose the distance (number of characters) and the order of the terms to suit the purpose.

* Proximity search: A search method that specifies the distance between keywords and their order.
* Short proximity: A proximity search over a short distance, aimed at hitting compound words or short phrases (roughly a distance of several to a few dozen characters).
* Long proximity: A proximity search over a long distance, aimed at hitting terms linked by the flow of the text or the context (roughly a distance of a few dozen to several hundred characters).

 Example. (Short proximity) Living body [proximity 5] analysis
→ A proximity search intended to hit compound words such as "biological analysis" and "biological information analysis", as well as short phrases such as "... analyzing living organisms" and "... analyzing biological information".

 Example. (Long proximity) Judgment [proximity 100] Alarm
→ A proximity search intended to hit terms linked by the flow of the context, such as "[3] ... a judgment is made (STEP4). [...] ... and an alarm is activated (STEP...)."
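
To make the idea of distance and order concrete, here is a minimal Python sketch of a proximity (neighborhood) check on a single text. It assumes that distance is counted in characters between the end of one term and the start of the other. Actual patent databases define distance and operator syntax in their own ways, so the function within_proximity and its parameters are hypothetical.

```python
import re


def within_proximity(text, first, second, max_gap, ordered=True):
    """Return True if the two terms occur at most `max_gap` characters apart.

    With ordered=True, `first` must come before `second`.
    """
    first_spans = [m.span() for m in re.finditer(re.escape(first), text)]
    second_spans = [m.span() for m in re.finditer(re.escape(second), text)]
    for f_start, f_end in first_spans:
        for s_start, s_end in second_spans:
            # Case 1: `first` precedes `second`.
            if s_start >= f_end and s_start - f_end <= max_gap:
                return True
            # Case 2: reversed order, accepted only when order is not enforced.
            if not ordered and f_start >= s_end and f_start - s_end <= max_gap:
                return True
    return False


if __name__ == "__main__":
    sentence = "biological information analysis"
    print(within_proximity(sentence, "biological", "analysis", 15))
    print(within_proximity(sentence, "biological", "analysis", 5))
```

A small max_gap behaves like the short form and mainly catches compound words, while a large max_gap behaves like the long form and catches terms that are only connected through the surrounding sentences.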

(3) Concept replacement

"Replacing a concept based on its meaning" is very effective in patent searching. For example, if a particular compound A is used in order to produce a particular effect B, swapping the concepts "compound A ⇔ effect B" may produce good results. Similarly, it may be worth swapping "member C ⇔ function D". For parameter patents, in addition to keywords that name the parameter directly, you can hit the target parameter indirectly by searching for its units, the relevant standards, the measuring instruments, and so on.

4. Other things to keep in mind

When considering keywords, you may feel that there is no end to how many you could list. In that case, the following points are worth keeping in mind.

(1) Decide where to look for keywords

Keywords include everyday terms, patent terms, and company-specific terms (dialects), and covering all of them perfectly is not realistic. It is therefore a good idea to decide on an information source first and then select keywords from it. For example, select everyday synonyms and near-synonyms from a general-purpose dictionary, and select patent terms from the wording of actual patent documents or from the explanations in the patent classification. Select industry terms and company-specific terms (dialects) from that industry's dictionaries, trade books, websites, and similar sources. When making the selection, we recommend setting a time limit for the search and starting with the sources that seem most important.

(Example of an information source)

Figure 1 JST Thesaurus map (search term: gallium nitride)

(2) Zipf's law

"Zipf's law" describes the relationship between a keyword's frequency rank and its share of occurrences. It is the rule of thumb that "the frequency with which a keyword appears is inversely proportional to its rank by frequency". For example, the keyword ranked 20th by frequency contributes only 5% as much recall as the keyword ranked 1st. In other words, the efficiency of keywords falls as you move from high-frequency terms to low-frequency ones. In keyword searching, therefore, you can raise recall more efficiently by accumulating the highest-ranked keywords than by spending time hunting for obscure ones. Keeping this rule in mind helps you avoid spending too much time on keyword selection and fixating on obscure keywords.

Figure 2 Zipf's law: cumulative distribution function (reproduced from Wikipedia)
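
The arithmetic behind this rule of thumb can be checked with a short Python sketch. It assumes the idealized form of Zipf's law, in which the frequency at rank r is proportional to 1/r, and a hypothetical vocabulary of 100 keywords. Real keyword distributions only approximate this, and the function names are our own.

```python
def zipf_relative_frequency(rank, exponent=1.0):
    """Frequency at `rank` relative to rank 1 under idealized Zipf's law."""
    return 1.0 / rank ** exponent


def cumulative_share(top_k, vocabulary_size, exponent=1.0):
    """Share of all occurrences covered by the `top_k` most frequent keywords."""
    total = sum(zipf_relative_frequency(r, exponent)
                for r in range(1, vocabulary_size + 1))
    top = sum(zipf_relative_frequency(r, exponent)
              for r in range(1, top_k + 1))
    return top / total


if __name__ == "__main__":
    # Rank 20 relative to rank 1: 1/20, i.e. 5%.
    print(zipf_relative_frequency(20))
    # Share covered by the top 10 of 100 hypothetical keywords.
    print(round(cumulative_share(10, 100), 3))
```

Under these assumptions the ten most frequent of 100 keywords already cover more than half of all occurrences, which is why accumulating high-ranked keywords first tends to pay off.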

(3) Review the concept before turning it into a keyword

So far we have described keyword search methods and ideas, but there are still cases where a keyword search does not work well. In such cases, we recommend reconsidering whether the concept you are expressing as a keyword is appropriate. When using a keyword, ask yourself what exactly you want to find with it. Confirm that keywords taken straight from a dictionary actually hit what you want and do not generate a lot of noise. Also, where possible, the concept you turn into a keyword should be a highly specific one, such as a noun.
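
One simple way to check for noise is to review a sample of hits and compute precision (the share of hits that are relevant) and recall (the share of known relevant documents that were hit). The following Python sketch assumes you have a list of hit document IDs and a list of documents already known to be relevant. The IDs and the function name precision_recall are hypothetical.

```python
def precision_recall(hits, relevant):
    """Return (precision, recall) for a set of hits against known relevant documents."""
    hit_set, relevant_set = set(hits), set(relevant)
    true_positives = len(hit_set & relevant_set)
    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(hit_set) if hit_set else 0.0
    recall = true_positives / len(relevant_set) if relevant_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical document IDs for illustration only.
    hits = ["A", "B", "C", "D"]
    relevant = ["A", "B", "E"]
    precision, recall = precision_recall(hits, relevant)
    print(f"precision={precision:.2f}, recall={recall:.2f}")
```

A keyword that lowers precision sharply while barely raising recall is a candidate for review or removal.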

5. Conclusion

In an actual patent search, you will most likely search using patent classifications, so it may be possible to search without scrutinizing keywords. To make the search more sophisticated, however, it is important to combine patent classifications with keywords. Being able to widen or narrow your keyword selection at will lets you control the precision and recall of the search results.
We hope this helps you refine your keyword search techniques and obtain search results that better serve your purpose.

Research Division Hashima

<References> (accessed 2019/4/24)
- http://jglobal.jst.go.jp/
- https://en.wikipedia.org/wiki/Zipf%27s_law


Aztec Co., Ltd. search column

In this column, as a research company with strengths in patent search and technical analysis, we deliver information that we hope will be useful to our readers.