API Reference
POST /dedup
Send a full menu and get back clusters of items that refer to the same dish. Catches spelling errors, transliterations, promotional noise, and serving size variations. Returns groups of duplicates plus a list of singletons (unique items).
Request
| Parameter | Type | Required | Description |
|---|---|---|---|
| items | string[] | Yes | Menu item texts to deduplicate. 1-2000 items, max 500 chars each. |
| cosine_threshold | float | No | Similarity threshold; lower values group more aggressively. Default: 0.85, which works well in most cases. |
{
  "items": [
    "Chicken Biryani",
    "Murgh Biryani Serves 2",
    "Paneer Tikka",
    "Panner Tika",
    "Masala Dosa",
    "**NEW** Chiken Biryani (Half)"
  ]
}
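The limits in the parameter table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_dedup_payload` is a hypothetical helper name, and the 0 to 1 range check on `cosine_threshold` is an assumption, since the accepted range is not stated here.

```python
MAX_ITEMS = 2000      # documented upper bound on items per request
MAX_ITEM_CHARS = 500  # documented upper bound on characters per item


def build_dedup_payload(items, cosine_threshold=None):
    """Return a JSON-serializable body for POST /dedup, or raise ValueError.

    Hypothetical helper: it only mirrors the limits listed in the table.
    """
    if not 1 <= len(items) <= MAX_ITEMS:
        raise ValueError(f"expected 1-{MAX_ITEMS} items, got {len(items)}")
    for index, text in enumerate(items):
        if not isinstance(text, str):
            raise ValueError(f"item {index} is not a string")
        if len(text) > MAX_ITEM_CHARS:
            raise ValueError(f"item {index} exceeds {MAX_ITEM_CHARS} chars")

    payload = {"items": list(items)}
    if cosine_threshold is not None:
        # Assumption: a cosine similarity threshold is only meaningful in [0, 1].
        if not 0.0 <= cosine_threshold <= 1.0:
            raise ValueError("cosine_threshold should be between 0 and 1")
        payload["cosine_threshold"] = float(cosine_threshold)
    return payload
```

Leaving `cosine_threshold` out of the payload lets the server apply its documented default of 0.85.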
Response
| Field | Type | Description |
|---|---|---|
| clusters | object[] | Groups of duplicate items |
| singletons | string[] | Items with no duplicates found |
| total_items | int | Total items submitted |
| duplicate_items | int | Number of excess duplicates (items in clusters minus number of clusters) |
| processing_time_ms | float | Processing time in milliseconds |
Each cluster:
| Field | Type | Description |
|---|---|---|
| cluster_id | int | Unique cluster identifier (1-based) |
| canonical | string | Shortest member, recommended as the canonical name |
| members | string[] | All items in this duplicate group (original text) |
| pairwise_scores | object[] | Similarity scores between each pair of members |
{
  "clusters": [
    {
      "cluster_id": 1,
      "canonical": "Chicken Biryani",
      "members": ["Chicken Biryani", "Murgh Biryani Serves 2", "**NEW** Chiken Biryani (Half)"],
      "pairwise_scores": [
        {"text_a": "Chicken Biryani", "text_b": "Murgh Biryani Serves 2", "score": 0.932841},
        {"text_a": "Chicken Biryani", "text_b": "**NEW** Chiken Biryani (Half)", "score": 0.951203},
        {"text_a": "Murgh Biryani Serves 2", "text_b": "**NEW** Chiken Biryani (Half)", "score": 0.910547}
      ]
    },
    {
      "cluster_id": 2,
      "canonical": "Paneer Tikka",
      "members": ["Paneer Tikka", "Panner Tika"],
      "pairwise_scores": [
        {"text_a": "Paneer Tikka", "text_b": "Panner Tika", "score": 0.961482}
      ]
    }
  ],
  "singletons": ["Masala Dosa"],
  "total_items": 6,
  "duplicate_items": 3,
  "processing_time_ms": 187.4
}
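The response fields lend themselves to two small client-side computations: recomputing the excess-duplicate count from the clusters, and choosing a canonical name from `pairwise_scores` instead of by length. The sketch below is illustrative only. Both function names are hypothetical, and picking the member with the highest mean similarity is an alternative selection rule chosen for this example, not the rule the API uses.

```python
from collections import defaultdict


def excess_duplicates(clusters):
    """Items in clusters minus number of clusters, as duplicate_items is defined."""
    return sum(len(c["members"]) for c in clusters) - len(clusters)


def most_central_member(cluster):
    """Return the member with the highest mean similarity to the other members.

    Assumption: member texts within a cluster are distinct, so they can be
    used as dictionary keys.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for pair in cluster["pairwise_scores"]:
        for name in (pair["text_a"], pair["text_b"]):
            totals[name] += pair["score"]
            counts[name] += 1
    if not counts:
        # No scores to compare: keep the API's own suggestion.
        return cluster["canonical"]

    def mean_score(member):
        return totals[member] / counts[member] if counts[member] else 0.0

    # max() keeps the first member on ties, so input order breaks ties.
    return max(cluster["members"], key=mean_score)
```

Applied to the sample response above, `excess_duplicates` returns 3 (five clustered items minus two clusters), which matches the `duplicate_items` field.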
Example
import requests

menu = [
    "Chicken Biryani",
    "Murgh Biryani Serves 2",
    "Paneer Tikka",
    "Panner Tika",
    "Masala Dosa",
    "**NEW** Chiken Biryani (Half)",
]

# Send the whole menu in one request; the default threshold (0.85) applies.
response = requests.post(
    "https://dish-embed.latimal.com/dedup",
    headers={"X-API-Key": "YOUR_KEY", "Content-Type": "application/json"},
    json={"items": menu},
)
response.raise_for_status()  # surface HTTP errors before parsing the body
data = response.json()

print(f"Found {len(data['clusters'])} duplicate groups, {data['duplicate_items']} excess items")
for cluster in data["clusters"]:
    print(f"\n  Canonical: {cluster['canonical']}")
    print(f"  Duplicates: {cluster['members']}")
Cost
1.0 credits per item.
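For budgeting, the per-item rate can be turned into a quick estimate. This is a rough sketch with a hypothetical function name. It assumes every submitted item is billed at the stated rate, whether it ends up in a cluster or as a singleton, which is not spelled out here.

```python
CREDITS_PER_ITEM = 1.0  # rate stated for POST /dedup


def estimate_dedup_credits(batches):
    """Return the credits needed to send every batch (each batch is a list of items)."""
    return sum(len(batch) for batch in batches) * CREDITS_PER_ITEM
```

Under that assumption, the six-item sample menu sent as a single batch costs 6.0 credits.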
Try it live — Test this endpoint in the interactive playground.
For a complete integration walkthrough, see the Menu Deduplication guide.
Notes
- duplicate_items counts excess items (total items in clusters minus number of clusters), not total clustered items. A cluster of 3 items = 2 excess duplicates.
- The canonical suggestion picks the shortest member. You may want to apply your own logic (e.g. prefer items without noise or misspellings).
- For menus over 2000 items, split into batches by category or restaurant and deduplicate each batch.
- Promotional text (prices, "NEW", serving sizes) is stripped before comparison, so "Chicken Biryani" and "BEST SELLER Chicken Biryani Rs. 299" will match.
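The batching advice for large menus can be sketched as follows. The helper name and the `(category, item_text)` input shape are assumptions made for illustration; adapt them to however your menu data is stored.

```python
from collections import defaultdict

MAX_ITEMS_PER_REQUEST = 2000  # documented /dedup limit


def batch_by_category(rows, max_items=MAX_ITEMS_PER_REQUEST):
    """Yield (category, items) batches that each fit in one /dedup request.

    rows is an iterable of (category, item_text) tuples (hypothetical shape).
    A category with more than max_items entries is split into several batches.
    """
    grouped = defaultdict(list)
    for category, text in rows:
        grouped[category].append(text)

    for category, items in grouped.items():
        for start in range(0, len(items), max_items):
            yield category, items[start:start + max_items]
```

Each batch is deduplicated on its own, so two copies of a dish that land in different batches will not be grouped together. Choosing a split key that keeps likely duplicates in the same batch, such as category or restaurant, limits that effect.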
POST /match
POST /match decides whether each pair of menu items refers to the same dish. Handles misspellings, transliterations, noise, and language differences. Accepts up to 100 pairs per request.
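Because a request carries at most 100 pairs, comparing every item against every other item takes several requests. The sketch below only prepares the pair batches. The request body for this endpoint is not described here, so the code stops short of sending anything, and `pair_batches` is a hypothetical helper name.

```python
from itertools import combinations, islice

MAX_PAIRS_PER_REQUEST = 100  # documented /match limit


def pair_batches(items, batch_size=MAX_PAIRS_PER_REQUEST):
    """Yield lists of up to batch_size (a, b) tuples covering every unordered pair."""
    pairs = combinations(items, 2)
    while True:
        batch = list(islice(pairs, batch_size))
        if not batch:
            return
        yield batch
```

The number of pairs grows quadratically with the number of items (n items give n(n-1)/2 pairs), so 15 items already need two requests: one with 100 pairs and one with 5.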
POST /classify
POST /classify predicts one of 19 cuisine labels per menu item, with a confidence score. Accepts up to 512 items per request, max 500 chars each.