
Gibberish

We define gibberish as any text that is unintelligible in the reader's target language.

Unintelligible text ranges from a random sequence of characters, such as asdasqweqdaczc, to a series of words that are each valid on their own but make no sense in combination. For example: dog boat the yes.

The former is somewhat easier for computers to detect. The latter is much harder, because every word exists and only the combination is meaningless.

Because of this, we introduced the concept of mild-gibberish, which covers sentences made of valid words with occurrences of gibberish in them.

Prediction labels

| Label | Meaning | Example |
| --- | --- | --- |
| normal | The model did not detect any gibberish in the input text | This is a text being used as example |
| mild-gibberish | Some gibberish was detected in the input, either obvious garbage inside the text or erratic writing that makes little sense | Text that somewhat ?>! makes sense 123123asdad |
| gibberish | The text was evaluated as completely unintelligible | qweqwe1!@@DAs vbxc? |

Input length

The model performs better on longer inputs, but we recommend splitting paragraphs into sentences of medium length.

Limits

The maximum accepted input length is 512 characters.
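
The length guidance can be sketched as a small pre-processing step. The helper below is a hypothetical illustration, not part of the API: `split_for_detection` and `MAX_CHARS` are names chosen for this example, the sentence boundary rule is deliberately naive, and it assumes the 512-character limit applies to the raw text before URL encoding.

```python
import re

MAX_CHARS = 512  # documented maximum input length (assumed to count raw characters)


def split_for_detection(paragraph: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split a paragraph into sentence-based chunks of at most max_chars characters."""
    # Naive sentence boundary: whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    chunks = []
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-wrap any single sentence that still exceeds the limit.
        for start in range(0, len(sentence), max_chars):
            chunks.append(sentence[start:start + max_chars])
    return chunks
```

Each returned chunk can then be sent as its own request. A real sentence tokenizer may give better boundaries than the regular expression used here.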

Invocation

curl -L -G 'http://api.textkit.ai/detect/gibberish' \
--data-urlencode 'text=this is just a random text' \
--header 'X-API-Key: your_api_key_here'
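
For callers who prefer a script over the command line, the same request can be sketched in Python with only the standard library. The endpoint, the `text` query parameter, and the `X-API-Key` header come from the curl example; the function names `build_request` and `detect_gibberish` and the timeout value are assumptions made for this illustration.

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://api.textkit.ai/detect/gibberish"  # endpoint from the curl example


def build_request(text: str, api_key: str) -> urllib.request.Request:
    """Build the GET request: URL-encoded text in the query string, key in a header."""
    query = urllib.parse.urlencode({"text": text})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


def detect_gibberish(text: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body as a dict."""
    request = build_request(text, api_key)
    # urlopen follows redirects by default, similar to curl's -L flag.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `detect_gibberish("this is just a random text", "your_api_key_here")` would perform the same lookup as the curl command, provided a valid key is supplied.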

Response

{
  "prediction": "normal",
  "confidence": "0.998",
  "time_ms": 1409
}

| Field | Meaning |
| --- | --- |
| prediction | The predicted label. See the prediction labels above for reference |
| confidence | Value between 0 and 1 that indicates how confident the model is |
| time_ms | Time in milliseconds the model took to predict the label. It does not include the network round-trip time between request and response |
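
A response body like the sample can be handled with a short parsing step. Note that the sample returns `confidence` as a quoted string, so the sketch below converts it to a float; it would also accept a bare number. The name `parse_prediction` and the decision to reject unknown labels are assumptions for this example, not documented behavior of the API.

```python
import json

# The three labels listed in the prediction labels table.
LABELS = ("normal", "mild-gibberish", "gibberish")


def parse_prediction(body: str) -> dict:
    """Decode a response body and normalize its field types."""
    data = json.loads(body)
    label = data["prediction"]
    if label not in LABELS:
        raise ValueError(f"unexpected prediction label: {label!r}")
    return {
        "prediction": label,
        # The sample response sends confidence as a string such as "0.998".
        "confidence": float(data["confidence"]),
        "time_ms": data["time_ms"],
    }
```

A caller might then compare `confidence` against a threshold of its own choosing before acting on a `mild-gibberish` or `gibberish` result.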