ANALYZE and TABULATE for Statistical Analysis

With ANALYZE, you can easily extract up to 50,000 terms from an answer set for statistical analysis.

Use the DISPLAY command to view your analyzed terms as a one-dimensional list.

Use the TABULATE command to correlate analyzed terms from two fields.

TABULATE is used mainly in patent and engineering databases. Any extraction field may be used except patent number, application number, priority number, CAS Registry Number®, accession number, Basic Index, and similar fields.

Example Use Cases

Example 1: Use ANALYZE to find top authors publishing on artificial collagen

  1. Enter the appropriate databases and run your search.
  2. (Optional) Eliminate duplicates.
  3. Enter ANALYZE (ANA), followed by the L-number of the answer set (L5), the answers to analyze (1-, meaning all answers), and the field code to analyze (AU). For example: => ANALYZE L5 1- AU

    page 1.png
  4. Display the 10 most frequently occurring terms.

    page 2.png
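
Conceptually, the analysis in Example 1 is a frequency ranking of the values found in one field. The Python sketch below is not STN code and does not claim to show how STN implements ANALYZE; it only mimics the idea on a small made-up answer set. The records, the author names, and the function name `analyze_field` are hypothetical, and counting a term once per record is an assumption made for this illustration.

```python
from collections import Counter

# Hypothetical answer set: each record lists the values of its AU (author) field.
records = [
    {"AU": ["Smith J", "Lee K"]},
    {"AU": ["Smith J"]},
    {"AU": ["Garcia M", "Lee K", "Smith J"]},
    {"AU": ["Lee K"]},
]

def analyze_field(records, field, top=10):
    """Rank the terms found in `field` by how many records contain them."""
    counts = Counter()
    for record in records:
        # set() counts a term once per record (an assumption for this sketch).
        counts.update(set(record.get(field, [])))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top]

top_authors = analyze_field(records, "AU")
for rank, (term, count) in enumerate(top_authors, start=1):
    print(f"{rank:>2}  {count:>3}  {term}")
```

The `top=10` default mirrors the default display of the 10 most frequently occurring terms.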

Display Options After ANALYZE

Option | Definition | Example
--- | --- | ---
Default | Display the 10 most frequently occurring terms, in the current sort order | => D
Sorting | Change sort to alphabetical order | => DISPLAY ALPHA
 | Change sort to occurrence (occ) order (default) | => DISPLAY OCC
Terms to display | Display all of the terms, in the current sort order | => DISPLAY ENTIRE or => DISPLAY 1-
 | Display the top "n" terms, in the current sort order | => DISPLAY TOP 20
 | Display a range or number of terms, in the current sort order | => DISPLAY 1-20
 | Display terms with occurrence/document counts greater than "n", in the current sort order | => DISPLAY OGT 5 or => DISPLAY DGT 5
 | Display terms with percentage counts greater than "n", in the current sort order | => DISPLAY %GT 10 or => DISPLAY PGT 10
 | Display only terms containing a specified character string, in quotes | => DISPLAY WITH "US"
 | Display only terms lacking a specified character string, in quotes | => DISPLAY NOT "US"
Additional requirements | Display the answer numbers with the terms | => DISPLAY ANS
 | Display the terms in the delimited format, for post-processing | => DISPLAY DELIMITED
 | Display results with field code appended | => DISPLAY DETAIL
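
The display options above amount to sorting and filtering a list of terms with counts. As a rough illustration only, the Python sketch below applies a few of them (alphabetical or occurrence order, top "n", counts greater than "n", and containing or lacking a string) to made-up data. The terms, the function name `display`, and its parameters are hypothetical, and combining several options in one call is a convenience of this sketch rather than a statement about STN command syntax.

```python
# Hypothetical analyzed terms as (term, occurrence count) pairs.
terms = [
    ("ACME US INC", 12),
    ("BETA GMBH", 7),
    ("GAMMA US LLC", 5),
    ("DELTA KK", 2),
]

def display(terms, sort="occ", top=None, ogt=None, with_text=None, not_text=None):
    """Return the terms that pass the requested filters, in the requested order."""
    rows = list(terms)
    if ogt is not None:        # modeled on OGT n: counts greater than n
        rows = [row for row in rows if row[1] > ogt]
    if with_text is not None:  # modeled on WITH "text": term contains the string
        rows = [row for row in rows if with_text in row[0]]
    if not_text is not None:   # modeled on NOT "text": term lacks the string
        rows = [row for row in rows if not_text not in row[0]]
    if sort == "alpha":
        rows.sort(key=lambda row: row[0])
    else:                      # occurrence order, highest count first
        rows.sort(key=lambda row: (-row[1], row[0]))
    return rows[:top] if top is not None else rows

for term, count in display(terms, with_text="US"):
    print(f"{count:>3}  {term}")
```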

Example 2: Use TABULATE to find companies patenting medicinal compositions derived from members of the ginseng family

  1. Enter the database and search the IPC code of therapeutic interest.
  2. With the answer set (L1), ANALYZE two fields (PA and PY).
  3. Enter TABULATE and the ANALYZE L-number.
    a. Choose the GRID format.
    b. Enter the primary and secondary display codes and the sort requirements.
    c. Enter Y to proceed.
    d. Click the View all hyperlink (not shown).

    page 3.png

The top 10 patent assignees (PA) are listed on the vertical axis; the publication years (PY) are listed on the horizontal axis. The document count is shown at each intersection of PA and PY.
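
The grid described above is a cross-tabulation: one field on each axis and a document count in each cell. The Python sketch below shows that idea on made-up data; it is not STN code and is not a description of how TABULATE works internally. The records, the assignee names, and the function name `tabulate` are hypothetical, and each record is assumed to carry exactly one PA value and one PY value.

```python
from collections import Counter

# Hypothetical patent records, each with one assignee (PA) and one publication year (PY).
records = [
    {"PA": "Alpha Co", "PY": 2019},
    {"PA": "Alpha Co", "PY": 2020},
    {"PA": "Alpha Co", "PY": 2020},
    {"PA": "Beta Ltd", "PY": 2019},
    {"PA": "Gamma Inc", "PY": 2020},
]

def tabulate(records, row_field, col_field, top_rows=10):
    """Build a grid of document counts: rows by row_field, columns by col_field."""
    cell_counts = Counter((r[row_field], r[col_field]) for r in records)
    row_totals = Counter(r[row_field] for r in records)
    # Keep the most frequent row terms; break ties alphabetically.
    ranked = sorted(row_totals.items(), key=lambda kv: (-kv[1], kv[0]))
    rows = [term for term, _ in ranked[:top_rows]]
    # List the column terms in ascending order.
    cols = sorted({r[col_field] for r in records})
    # A Counter returns 0 for pairs that never occur, which fills empty cells.
    grid = {row: [cell_counts[(row, col)] for col in cols] for row in rows}
    return cols, grid

cols, grid = tabulate(records, "PA", "PY")
print("PA".ljust(12) + "".join(str(col).rjust(6) for col in cols))
for row, counts in grid.items():
    print(row.ljust(12) + "".join(str(n).rjust(6) for n in counts))
```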

Example 3: Using the ANALYZE tool in STNext

  1. Choose a file and conduct a search, and then look at the results in the Search History tab. Click the Analyze icon:

    page 4.png

  2. The Analyze window opens. Up to 50,000 records can be analyzed. The number of answers to be analyzed is prepopulated with the number of records in your answer set. You can change how many records to analyze (e.g., 1-100) or which records to analyze (e.g., 1000-1822). Select up to two fields to analyze, and then click the Analyze button.

    page 4 2nd.png

  3. Manage your analysis results by:
    1. Changing the sort order of your data or switching the axes
    2. Downloading the results in Excel or .csv format (recommended)
    3. Navigating through your chart using slider bars

      page 5.png

Solution images are included for illustrative purposes only. Your experience may vary based on recent enhancements or product license.