Cloud Natural Language API . documents

Instance Methods

analyzeEntities(body=None, x__xgafv=None)

Finds named entities (currently proper names and common nouns) in the text along with entity types, salience, mentions for each entity, and other properties.

analyzeEntitySentiment(body=None, x__xgafv=None)

Finds entities in the text, similar to AnalyzeEntities, and analyzes the sentiment associated with each entity and its mentions.

analyzeSentiment(body=None, x__xgafv=None)

Analyzes the sentiment of the provided text.

analyzeSyntax(body=None, x__xgafv=None)

Analyzes the syntax of the text and provides sentence boundaries and tokenization along with part of speech tags, dependency trees, and other properties.

annotateText(body=None, x__xgafv=None)

A convenience method that provides all the features that analyzeSentiment, analyzeEntities, and analyzeSyntax provide in one call.

classifyText(body=None, x__xgafv=None)

Classifies a document into categories.

close()

Close httplib2 connections.

Method Details

analyzeEntities(body=None, x__xgafv=None)
Finds named entities (currently proper names and common nouns) in the text along with entity types, salience, mentions for each entity, and other properties.

Args:
  body: object, The request body.
    The object takes the form of:

{ # The entity analysis request message.
  "document": { # Represents the input to API methods. # Required. Input document.
    "content": "A String", # The content of the input in string format. Cloud audit logging exempt since it is based on user data.
    "gcsContentUri": "A String", # The Google Cloud Storage URI where the file content is located. This URI must be of the form: gs://bucket_name/object_name. For more details, see https://cloud.google.com/storage/docs/reference-uris. NOTE: Cloud Storage object versioning is not supported.
    "language": "A String", # The language of the document (if not specified, the language is automatically detected). Both ISO and BCP-47 language codes are accepted. [Language Support](https://cloud.google.com/natural-language/docs/languages) lists currently supported languages for each API method. If the language (either specified by the caller or automatically detected) is not supported by the called API method, an `INVALID_ARGUMENT` error is returned.
    "type": "A String", # Required. If the type is not set or is `TYPE_UNSPECIFIED`, returns an `INVALID_ARGUMENT` error.
  },
  "encodingType": "A String", # The encoding type used by the API to calculate offsets.
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

{ # The entity analysis response message.
  "entities": [ # The recognized entities in the input document.
    { # Represents a phrase in the text that is a known entity, such as a person, an organization, or location. The API associates information, such as salience and mentions, with entities.
      "mentions": [ # The mentions of this entity in the input document. The API currently supports proper noun mentions.
        { # Represents a mention for an entity in the text. Currently, proper noun mentions are supported.
          "sentiment": { # Represents the feeling associated with the entire text or entities in the text. # For calls to AnalyzeEntitySentiment or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the sentiment expressed for this mention of the entity in the provided document.
            "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
            "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
          },
          "text": { # Represents an output piece of text. # The mention text.
            "beginOffset": 42, # The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
            "content": "A String", # The content of the output text.
          },
          "type": "A String", # The type of the entity mention.
        },
      ],
      "metadata": { # Metadata associated with the entity. For most entity types, the metadata is a Wikipedia URL (`wikipedia_url`) and Knowledge Graph MID (`mid`), if they are available. For the metadata associated with other entity types, see the Type table below.
        "a_key": "A String",
      },
      "name": "A String", # The representative name for the entity.
      "salience": 3.14, # The salience score associated with the entity in the [0, 1.0] range. The salience score for an entity provides information about the importance or centrality of that entity to the entire document text. Scores closer to 0 are less salient, while scores closer to 1.0 are highly salient.
      "sentiment": { # Represents the feeling associated with the entire text or entities in the text. # For calls to AnalyzeEntitySentiment or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the aggregate sentiment expressed for this entity in the provided document.
        "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
        "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
      },
      "type": "A String", # The entity type.
    },
  ],
  "language": "A String", # The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.
}
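
The request and response shapes above can be exercised with a short sketch like the one below. It assumes `documents` is the resource obtained from `googleapiclient.discovery.build("language", "v1").documents()`; the version string, the `PLAIN_TEXT` and `UTF32` enum values, and the helper names are assumptions for illustration and are not stated in this reference.

```python
def entities_request(text):
    """Build an analyzeEntities request body for inline plain text."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        # Assumption: UTF32 is chosen so that offsets count code points.
        "encodingType": "UTF32",
    }


def rank_by_salience(response, limit=5):
    """Return (name, type, salience) tuples, most salient first."""
    entities = sorted(
        response.get("entities", []),
        key=lambda entity: entity.get("salience", 0.0),
        reverse=True,
    )
    return [
        (entity.get("name"), entity.get("type"), entity.get("salience", 0.0))
        for entity in entities[:limit]
    ]


def analyze_entities(documents, text, limit=5):
    """Call the API through an existing `documents` resource and rank the result."""
    response = documents.analyzeEntities(body=entities_request(text)).execute()
    return rank_by_salience(response, limit)
```

Keeping the request builder and the response parser separate from the network call makes both easy to check against recorded responses.
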
analyzeEntitySentiment(body=None, x__xgafv=None)
Finds entities in the text, similar to AnalyzeEntities, and analyzes the sentiment associated with each entity and its mentions.

Args:
  body: object, The request body.
    The object takes the form of:

{ # The entity-level sentiment analysis request message.
  "document": { # Represents the input to API methods. # Required. Input document.
    "content": "A String", # The content of the input in string format. Cloud audit logging exempt since it is based on user data.
    "gcsContentUri": "A String", # The Google Cloud Storage URI where the file content is located. This URI must be of the form: gs://bucket_name/object_name. For more details, see https://cloud.google.com/storage/docs/reference-uris. NOTE: Cloud Storage object versioning is not supported.
    "language": "A String", # The language of the document (if not specified, the language is automatically detected). Both ISO and BCP-47 language codes are accepted. [Language Support](https://cloud.google.com/natural-language/docs/languages) lists currently supported languages for each API method. If the language (either specified by the caller or automatically detected) is not supported by the called API method, an `INVALID_ARGUMENT` error is returned.
    "type": "A String", # Required. If the type is not set or is `TYPE_UNSPECIFIED`, returns an `INVALID_ARGUMENT` error.
  },
  "encodingType": "A String", # The encoding type used by the API to calculate offsets.
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

{ # The entity-level sentiment analysis response message.
  "entities": [ # The recognized entities in the input document with associated sentiments.
    { # Represents a phrase in the text that is a known entity, such as a person, an organization, or location. The API associates information, such as salience and mentions, with entities.
      "mentions": [ # The mentions of this entity in the input document. The API currently supports proper noun mentions.
        { # Represents a mention for an entity in the text. Currently, proper noun mentions are supported.
          "sentiment": { # Represents the feeling associated with the entire text or entities in the text. # For calls to AnalyzeEntitySentiment or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the sentiment expressed for this mention of the entity in the provided document.
            "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
            "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
          },
          "text": { # Represents an output piece of text. # The mention text.
            "beginOffset": 42, # The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
            "content": "A String", # The content of the output text.
          },
          "type": "A String", # The type of the entity mention.
        },
      ],
      "metadata": { # Metadata associated with the entity. For most entity types, the metadata is a Wikipedia URL (`wikipedia_url`) and Knowledge Graph MID (`mid`), if they are available. For the metadata associated with other entity types, see the Type table below.
        "a_key": "A String",
      },
      "name": "A String", # The representative name for the entity.
      "salience": 3.14, # The salience score associated with the entity in the [0, 1.0] range. The salience score for an entity provides information about the importance or centrality of that entity to the entire document text. Scores closer to 0 are less salient, while scores closer to 1.0 are highly salient.
      "sentiment": { # Represents the feeling associated with the entire text or entities in the text. # For calls to AnalyzeEntitySentiment or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the aggregate sentiment expressed for this entity in the provided document.
        "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
        "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
      },
      "type": "A String", # The entity type.
    },
  ],
  "language": "A String", # The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.
}
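
As a rough illustration of reading this response, the sketch below flattens per-mention sentiment and picks out entities with a negative aggregate score. The helper names and the `-0.25` threshold are hypothetical choices, not values defined by the API, and `documents` is assumed to be an already built `documents()` resource.

```python
def mention_sentiments(response):
    """Return (entity name, mention text, score, magnitude) for every mention."""
    rows = []
    for entity in response.get("entities", []):
        for mention in entity.get("mentions", []):
            sentiment = mention.get("sentiment", {})
            rows.append((
                entity.get("name"),
                mention.get("text", {}).get("content"),
                sentiment.get("score", 0.0),
                sentiment.get("magnitude", 0.0),
            ))
    return rows


def negative_entities(response, threshold=-0.25):
    """Return names of entities whose aggregate score is below `threshold`."""
    return [
        entity.get("name")
        for entity in response.get("entities", [])
        if entity.get("sentiment", {}).get("score", 0.0) < threshold
    ]


def analyze_entity_sentiment(documents, text):
    """Call the API through an existing `documents` resource."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF32",  # assumption: code point offsets are wanted
    }
    return documents.analyzeEntitySentiment(body=body).execute()
```
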
analyzeSentiment(body=None, x__xgafv=None)
Analyzes the sentiment of the provided text.

Args:
  body: object, The request body.
    The object takes the form of:

{ # The sentiment analysis request message.
  "document": { # Represents the input to API methods. # Required. Input document.
    "content": "A String", # The content of the input in string format. Cloud audit logging exempt since it is based on user data.
    "gcsContentUri": "A String", # The Google Cloud Storage URI where the file content is located. This URI must be of the form: gs://bucket_name/object_name. For more details, see https://cloud.google.com/storage/docs/reference-uris. NOTE: Cloud Storage object versioning is not supported.
    "language": "A String", # The language of the document (if not specified, the language is automatically detected). Both ISO and BCP-47 language codes are accepted. [Language Support](https://cloud.google.com/natural-language/docs/languages) lists currently supported languages for each API method. If the language (either specified by the caller or automatically detected) is not supported by the called API method, an `INVALID_ARGUMENT` error is returned.
    "type": "A String", # Required. If the type is not set or is `TYPE_UNSPECIFIED`, returns an `INVALID_ARGUMENT` error.
  },
  "encodingType": "A String", # The encoding type used by the API to calculate sentence offsets.
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

{ # The sentiment analysis response message.
  "documentSentiment": { # Represents the feeling associated with the entire text or entities in the text. # The overall sentiment of the input document.
    "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
    "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
  },
  "language": "A String", # The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.
  "sentences": [ # The sentiment for all the sentences in the document.
    { # Represents a sentence in the input document.
      "sentiment": { # Represents the feeling associated with the entire text or entities in the text. # For calls to AnalyzeSentiment or if AnnotateTextRequest.Features.extract_document_sentiment is set to true, this field will contain the sentiment for the sentence.
        "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
        "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
      },
      "text": { # Represents an output piece of text. # The sentence text.
        "beginOffset": 42, # The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
        "content": "A String", # The content of the output text.
      },
    },
  ],
}
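
The sketch below shows one way to read the document-level and sentence-level sentiment, and to check `beginOffset` against the original text. It assumes the request used an encoding type whose offsets count code points (taken here to be `UTF32`), so that offsets line up with Python string indices; the helper names are hypothetical.

```python
def overall_sentiment(response):
    """Return (score, magnitude) for the whole document."""
    sentiment = response.get("documentSentiment", {})
    return sentiment.get("score", 0.0), sentiment.get("magnitude", 0.0)


def sentence_scores(response, original_text):
    """Return (sentence, score, offset_matches) for each sentence.

    offset_matches is True when the sentence content is found in
    original_text exactly at the reported beginOffset.
    """
    rows = []
    for sentence in response.get("sentences", []):
        span = sentence.get("text", {})
        content = span.get("content", "")
        offset = span.get("beginOffset", 0)
        matches = original_text[offset:offset + len(content)] == content
        score = sentence.get("sentiment", {}).get("score", 0.0)
        rows.append((content, score, matches))
    return rows


def analyze_sentiment(documents, text):
    """Call the API through an existing `documents` resource."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF32",  # assumption, see the note above
    }
    return documents.analyzeSentiment(body=body).execute()
```

Score carries the direction of the sentiment and magnitude its overall strength, so the two are usually read together rather than in isolation.
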
analyzeSyntax(body=None, x__xgafv=None)
Analyzes the syntax of the text and provides sentence boundaries and tokenization along with part of speech tags, dependency trees, and other properties.

Args:
  body: object, The request body.
    The object takes the form of:

{ # The syntax analysis request message.
  "document": { # Represents the input to API methods. # Required. Input document.
    "content": "A String", # The content of the input in string format. Cloud audit logging exempt since it is based on user data.
    "gcsContentUri": "A String", # The Google Cloud Storage URI where the file content is located. This URI must be of the form: gs://bucket_name/object_name. For more details, see https://cloud.google.com/storage/docs/reference-uris. NOTE: Cloud Storage object versioning is not supported.
    "language": "A String", # The language of the document (if not specified, the language is automatically detected). Both ISO and BCP-47 language codes are accepted. [Language Support](https://cloud.google.com/natural-language/docs/languages) lists currently supported languages for each API method. If the language (either specified by the caller or automatically detected) is not supported by the called API method, an `INVALID_ARGUMENT` error is returned.
    "type": "A String", # Required. If the type is not set or is `TYPE_UNSPECIFIED`, returns an `INVALID_ARGUMENT` error.
  },
  "encodingType": "A String", # The encoding type used by the API to calculate offsets.
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

{ # The syntax analysis response message.
  "language": "A String", # The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.
  "sentences": [ # Sentences in the input document.
    { # Represents a sentence in the input document.
      "sentiment": { # Represents the feeling associated with the entire text or entities in the text. # For calls to AnalyzeSentiment or if AnnotateTextRequest.Features.extract_document_sentiment is set to true, this field will contain the sentiment for the sentence.
        "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
        "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
      },
      "text": { # Represents an output piece of text. # The sentence text.
        "beginOffset": 42, # The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
        "content": "A String", # The content of the output text.
      },
    },
  ],
  "tokens": [ # Tokens, along with their syntactic information, in the input document.
    { # Represents the smallest syntactic building block of the text.
      "dependencyEdge": { # Represents dependency parse tree information for a token. (For more information on dependency labels, see http://www.aclweb.org/anthology/P13-2017) # Dependency tree parse for this token.
        "headTokenIndex": 42, # Represents the head of this token in the dependency tree. This is the index of the token which has an arc going to this token. The index is the position of the token in the array of tokens returned by the API method. If this token is a root token, then the `headTokenIndex` is its own index.
        "label": "A String", # The parse label for the token.
      },
      "lemma": "A String", # [Lemma](https://en.wikipedia.org/wiki/Lemma_%28morphology%29) of the token.
      "partOfSpeech": { # Represents part of speech information for a token. Parts of speech are as defined in http://www.lrec-conf.org/proceedings/lrec2012/pdf/274_Paper.pdf # Part of speech tag for this token.
        "aspect": "A String", # The grammatical aspect.
        "case": "A String", # The grammatical case.
        "form": "A String", # The grammatical form.
        "gender": "A String", # The grammatical gender.
        "mood": "A String", # The grammatical mood.
        "number": "A String", # The grammatical number.
        "person": "A String", # The grammatical person.
        "proper": "A String", # The grammatical properness.
        "reciprocity": "A String", # The grammatical reciprocity.
        "tag": "A String", # The part of speech tag.
        "tense": "A String", # The grammatical tense.
        "voice": "A String", # The grammatical voice.
      },
      "text": { # Represents an output piece of text. # The token text.
        "beginOffset": 42, # The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
        "content": "A String", # The content of the output text.
      },
    },
  ],
}
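
The `headTokenIndex` rule described above (a root token points at its own index) can be turned into a small tree-building sketch. The helper names are hypothetical, and treating a missing `headTokenIndex` as 0 is an assumption made because JSON encodings may omit zero-valued fields.

```python
def dependency_tree(tokens):
    """Return (roots, children) built from each token's headTokenIndex.

    roots is a list of token indices whose head is themselves;
    children maps a head index to the indices of its dependents.
    """
    roots = []
    children = {index: [] for index in range(len(tokens))}
    for index, token in enumerate(tokens):
        # Assumption: an absent headTokenIndex means 0.
        head = token.get("dependencyEdge", {}).get("headTokenIndex", 0)
        if head == index:
            roots.append(index)
        else:
            children[head].append(index)
    return roots, children


def token_rows(tokens):
    """Return (text, part-of-speech tag, dependency label) per token."""
    return [
        (
            token.get("text", {}).get("content"),
            token.get("partOfSpeech", {}).get("tag"),
            token.get("dependencyEdge", {}).get("label"),
        )
        for token in tokens
    ]


def analyze_syntax(documents, text):
    """Call the API through an existing `documents` resource."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF32",  # assumption: code point offsets are wanted
    }
    return documents.analyzeSyntax(body=body).execute()
```
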
annotateText(body=None, x__xgafv=None)
A convenience method that provides all the features that analyzeSentiment, analyzeEntities, and analyzeSyntax provide in one call.

Args:
  body: object, The request body.
    The object takes the form of:

{ # The request message for the text annotation API, which can perform multiple analysis types (sentiment, entities, and syntax) in one call.
  "document": { # Represents the input to API methods. # Required. Input document.
    "content": "A String", # The content of the input in string format. Cloud audit logging exempt since it is based on user data.
    "gcsContentUri": "A String", # The Google Cloud Storage URI where the file content is located. This URI must be of the form: gs://bucket_name/object_name. For more details, see https://cloud.google.com/storage/docs/reference-uris. NOTE: Cloud Storage object versioning is not supported.
    "language": "A String", # The language of the document (if not specified, the language is automatically detected). Both ISO and BCP-47 language codes are accepted. [Language Support](https://cloud.google.com/natural-language/docs/languages) lists currently supported languages for each API method. If the language (either specified by the caller or automatically detected) is not supported by the called API method, an `INVALID_ARGUMENT` error is returned.
    "type": "A String", # Required. If the type is not set or is `TYPE_UNSPECIFIED`, returns an `INVALID_ARGUMENT` error.
  },
  "encodingType": "A String", # The encoding type used by the API to calculate offsets.
  "features": { # All available features for sentiment, syntax, and semantic analysis. Setting each one to true will enable that specific analysis for the input. # Required. The enabled features.
    "classifyText": True or False, # Classify the full document into categories.
    "extractDocumentSentiment": True or False, # Extract document-level sentiment.
    "extractEntities": True or False, # Extract entities.
    "extractEntitySentiment": True or False, # Extract entities and their associated sentiment.
    "extractSyntax": True or False, # Extract syntax information.
  },
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

{ # The text annotations response message.
  "categories": [ # Categories identified in the input document.
    { # Represents a category returned from the text classifier.
      "confidence": 3.14, # The classifier's confidence in the category. The number represents how certain the classifier is that this category represents the given text.
      "name": "A String", # The name of the category representing the document, from the [predefined taxonomy](https://cloud.google.com/natural-language/docs/categories).
    },
  ],
  "documentSentiment": { # Represents the feeling associated with the entire text or entities in the text. # The overall sentiment for the document. Populated if the user enables AnnotateTextRequest.Features.extract_document_sentiment.
    "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
    "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
  },
  "entities": [ # Entities, along with their semantic information, in the input document. Populated if the user enables AnnotateTextRequest.Features.extract_entities.
    { # Represents a phrase in the text that is a known entity, such as a person, an organization, or location. The API associates information, such as salience and mentions, with entities.
      "mentions": [ # The mentions of this entity in the input document. The API currently supports proper noun mentions.
        { # Represents a mention for an entity in the text. Currently, proper noun mentions are supported.
          "sentiment": { # Represents the feeling associated with the entire text or entities in the text. # For calls to AnalyzeEntitySentiment or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the sentiment expressed for this mention of the entity in the provided document.
            "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
            "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
          },
          "text": { # Represents an output piece of text. # The mention text.
            "beginOffset": 42, # The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
            "content": "A String", # The content of the output text.
          },
          "type": "A String", # The type of the entity mention.
        },
      ],
      "metadata": { # Metadata associated with the entity. For most entity types, the metadata is a Wikipedia URL (`wikipedia_url`) and Knowledge Graph MID (`mid`), if they are available. For the metadata associated with other entity types, see the Type table below.
        "a_key": "A String",
      },
      "name": "A String", # The representative name for the entity.
      "salience": 3.14, # The salience score associated with the entity in the [0, 1.0] range. The salience score for an entity provides information about the importance or centrality of that entity to the entire document text. Scores closer to 0 are less salient, while scores closer to 1.0 are highly salient.
      "sentiment": { # Represents the feeling associated with the entire text or entities in the text. # For calls to AnalyzeEntitySentiment or if AnnotateTextRequest.Features.extract_entity_sentiment is set to true, this field will contain the aggregate sentiment expressed for this entity in the provided document.
        "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
        "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
      },
      "type": "A String", # The entity type.
    },
  ],
  "language": "A String", # The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language. See Document.language field for more details.
  "sentences": [ # Sentences in the input document. Populated if the user enables AnnotateTextRequest.Features.extract_syntax.
    { # Represents a sentence in the input document.
      "sentiment": { # Represents the feeling associated with the entire text or entities in the text. # For calls to AnalyzeSentiment or if AnnotateTextRequest.Features.extract_document_sentiment is set to true, this field will contain the sentiment for the sentence.
        "magnitude": 3.14, # A non-negative number in the [0, +inf) range, which represents the absolute magnitude of sentiment regardless of score (positive or negative).
        "score": 3.14, # Sentiment score between -1.0 (negative sentiment) and 1.0 (positive sentiment).
      },
      "text": { # Represents an output piece of text. # The sentence text.
        "beginOffset": 42, # The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
        "content": "A String", # The content of the output text.
      },
    },
  ],
  "tokens": [ # Tokens, along with their syntactic information, in the input document. Populated if the user enables AnnotateTextRequest.Features.extract_syntax.
    { # Represents the smallest syntactic building block of the text.
      "dependencyEdge": { # Represents dependency parse tree information for a token. (For more information on dependency labels, see http://www.aclweb.org/anthology/P13-2017) # Dependency tree parse for this token.
        "headTokenIndex": 42, # Represents the head of this token in the dependency tree. This is the index of the token which has an arc going to this token. The index is the position of the token in the array of tokens returned by the API method. If this token is a root token, then the `headTokenIndex` is its own index.
        "label": "A String", # The parse label for the token.
      },
      "lemma": "A String", # [Lemma](https://en.wikipedia.org/wiki/Lemma_%28morphology%29) of the token.
      "partOfSpeech": { # Represents part of speech information for a token. Parts of speech are as defined in http://www.lrec-conf.org/proceedings/lrec2012/pdf/274_Paper.pdf # Part of speech tag for this token.
        "aspect": "A String", # The grammatical aspect.
        "case": "A String", # The grammatical case.
        "form": "A String", # The grammatical form.
        "gender": "A String", # The grammatical gender.
        "mood": "A String", # The grammatical mood.
        "number": "A String", # The grammatical number.
        "person": "A String", # The grammatical person.
        "proper": "A String", # The grammatical properness.
        "reciprocity": "A String", # The grammatical reciprocity.
        "tag": "A String", # The part of speech tag.
        "tense": "A String", # The grammatical tense.
        "voice": "A String", # The grammatical voice.
      },
      "text": { # Represents an output piece of text. # The token text.
        "beginOffset": 42, # The API calculates the beginning offset of the content in the original document according to the EncodingType specified in the API request.
        "content": "A String", # The content of the output text.
      },
    },
  ],
}
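
Because `features` decides which parts of the response are populated, a request builder that validates feature names can help catch typos early. The sketch below is one possible shape; the helper names are hypothetical, and the `PLAIN_TEXT` and `UTF32` values are assumptions not listed in this reference.

```python
FEATURE_KEYS = (
    "classifyText",
    "extractDocumentSentiment",
    "extractEntities",
    "extractEntitySentiment",
    "extractSyntax",
)

RESPONSE_SECTIONS = ("categories", "documentSentiment", "entities", "sentences", "tokens")


def annotate_request(text, enabled):
    """Build an annotateText body with only the named features set to True."""
    enabled = set(enabled)
    unknown = enabled - set(FEATURE_KEYS)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF32",
        "features": {key: key in enabled for key in FEATURE_KEYS},
    }


def populated_sections(response):
    """Return the names of response sections that are present and non-empty."""
    return [name for name in RESPONSE_SECTIONS if response.get(name)]


def annotate_text(documents, text, enabled):
    """Call the API through an existing `documents` resource."""
    return documents.annotateText(body=annotate_request(text, enabled)).execute()
```
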
classifyText(body=None, x__xgafv=None)
Classifies a document into categories.

Args:
  body: object, The request body.
    The object takes the form of:

{ # The document classification request message.
  "document": { # Represents the input to API methods. # Required. Input document.
    "content": "A String", # The content of the input in string format. Cloud audit logging exempt since it is based on user data.
    "gcsContentUri": "A String", # The Google Cloud Storage URI where the file content is located. This URI must be of the form: gs://bucket_name/object_name. For more details, see https://cloud.google.com/storage/docs/reference-uris. NOTE: Cloud Storage object versioning is not supported.
    "language": "A String", # The language of the document (if not specified, the language is automatically detected). Both ISO and BCP-47 language codes are accepted. [Language Support](https://cloud.google.com/natural-language/docs/languages) lists currently supported languages for each API method. If the language (either specified by the caller or automatically detected) is not supported by the called API method, an `INVALID_ARGUMENT` error is returned.
    "type": "A String", # Required. If the type is not set or is `TYPE_UNSPECIFIED`, returns an `INVALID_ARGUMENT` error.
  },
}

  x__xgafv: string, V1 error format.
    Allowed values
      1 - v1 error format
      2 - v2 error format

Returns:
  An object of the form:

{ # The document classification response message.
  "categories": [ # Categories representing the input document.
    { # Represents a category returned from the text classifier.
      "confidence": 3.14, # The classifier's confidence in the category. The number represents how certain the classifier is that this category represents the given text.
      "name": "A String", # The name of the category representing the document, from the [predefined taxonomy](https://cloud.google.com/natural-language/docs/categories).
    },
  ],
}
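
A possible way to classify a document stored in Cloud Storage and keep only confident categories is sketched below. The helper names and the `0.5` cutoff are hypothetical, the `PLAIN_TEXT` value is an assumption not listed in this reference, and `documents` is assumed to be an already built `documents()` resource.

```python
def classify_request_from_gcs(uri):
    """Build a classifyText body that points at a gs://bucket_name/object_name URI."""
    if not uri.startswith("gs://"):
        raise ValueError("gcsContentUri must have the form gs://bucket_name/object_name")
    return {"document": {"type": "PLAIN_TEXT", "gcsContentUri": uri}}


def confident_categories(response, minimum=0.5):
    """Return (name, confidence) pairs at or above `minimum`, highest first."""
    pairs = [
        (category.get("name"), category.get("confidence", 0.0))
        for category in response.get("categories", [])
        if category.get("confidence", 0.0) >= minimum
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def classify_text(documents, uri, minimum=0.5):
    """Call the API through an existing `documents` resource."""
    response = documents.classifyText(body=classify_request_from_gcs(uri)).execute()
    return confident_categories(response, minimum)
```
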
close()
Close httplib2 connections.
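
One way to make sure `close()` is always called is to wrap the resource in `contextlib.closing`, which works with any object that has a `close()` method. The wrapper name `run_and_close` is hypothetical, and `documents` is assumed to be an already built `documents()` resource.

```python
from contextlib import closing


def run_and_close(documents, work):
    """Call work(documents) and close the resource afterwards, even on error."""
    with closing(documents):
        return work(documents)
```

For example, `run_and_close(documents, lambda d: d.analyzeSentiment(body=body).execute())` would return the response and then release the connections.
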