Language

Our language detector uses spaCy and fastText to predict the language in which the input text is written.

As with many of our models, longer input yields better predictions. This matters most when the text is written in a language that shares a recent common ancestor with others, as the Romance languages do with one another: Spanish, Portuguese and Catalan, for example.

Limits

The maximum accepted input length is 512 characters.

Prediction labels

We return the two-letter ISO 639-1 code of the predicted language, together with the language name written in English.

Example:

| Field | Value | Meaning |
|---|---|---|
| iso_code | pt | The text was predicted to be in Portuguese |
| language | Portuguese | Literal representation of pt, written in English |

Invocation

curl -L -G 'http://api.textkit.ai/detect/language' \
--data-urlencode 'text=this is just a random text' \
--header 'X-API-Key: your_api_key_here'
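
The same request can be built from Python. What follows is a minimal sketch using only the standard library, not an official client: the names `build_request`, `API_URL` and `MAX_CHARS` are illustrative, and it assumes the 512-character limit counts characters of the raw text before URL encoding. It truncates overlong input rather than rejecting it, which is only one possible way to respect the limit.

```python
import urllib.parse
import urllib.request

API_URL = "http://api.textkit.ai/detect/language"  # endpoint from the curl example
MAX_CHARS = 512  # documented maximum input length


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request mirroring the curl call."""
    # Assumption: the limit applies to the raw text, so truncate before encoding.
    query = urllib.parse.urlencode({"text": text[:MAX_CHARS]})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


if __name__ == "__main__":
    req = build_request("this is just a random text", "your_api_key_here")
    print(req.full_url)
    # To actually send it: urllib.request.urlopen(req).read()
```

Building the request separately from sending it keeps the length handling easy to check without network access.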

Response

{
  "prediction": {
    "iso_code": "en",
    "language": "English"
  },
  "confidence": "0.982",
  "time_ms": 16
}

| Field | Meaning |
|---|---|
| prediction | The predicted label. See above for reference |
| confidence | Value between 0 and 1 that indicates how confident the model is |
| time_ms | Time in milliseconds the model took to predict the label. It does not account for the network round trip time between request and response |
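
In the sample response, `confidence` appears as a JSON string rather than a number, so a client will likely need to convert it before comparing it against a threshold. The following Python sketch parses the sample response shown above; `Detection`, `parse_response` and the 0.5 threshold are hypothetical names and values chosen for illustration, not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass
class Detection:
    iso_code: str
    language: str
    confidence: float
    time_ms: int


def parse_response(body: str) -> Detection:
    """Parse the JSON response body into a typed record."""
    data = json.loads(body)
    return Detection(
        iso_code=data["prediction"]["iso_code"],
        language=data["prediction"]["language"],
        # The sample encodes confidence as a string; float() also accepts
        # a plain number should the API return one instead.
        confidence=float(data["confidence"]),
        time_ms=int(data["time_ms"]),
    )


sample = (
    '{"prediction": {"iso_code": "en", "language": "English"},'
    ' "confidence": "0.982", "time_ms": 16}'
)
result = parse_response(sample)
if result.confidence >= 0.5:  # hypothetical threshold, tune for your use case
    print(result.iso_code, result.language)
```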