JavaScript case- and accent-insensitive search
Finding a string inside another string in JavaScript is quite easy and can be achieved in different ways. However, depending on the use case (language, case sensitivity, accents, etc.), you may need a more appropriate solution.
Let's start with the basics, setting case and accents aside for now.
Basics
With the indexOf method
A quite straightforward and simple way. This method is case sensitive.
Return value: The index of the first occurrence of searchString found, or -1 if not found.
const str = 'Hello world';
str.indexOf('world'); // Match at index 6
const frStr = "J'ai déjà quelque chose de prévu";
frStr.indexOf('déjà'); // Match at index 5
frStr.indexOf('deja'); // No match, returns -1
With the includes method
As simple as indexOf, and also case sensitive. It returns a boolean instead of an index.
const str = 'Hello world';
str.includes('world'); // Match, returns true
const frStr = "J'ai déjà quelque chose de prévu";
frStr.includes('déjà'); // Match, returns true
frStr.includes('deja'); // No match, returns false
With a Regex
The good old way. Case insensitive with the i flag.
const str = 'Hello world';
str.match(/hello/); // No match, returns null
str.match(/hello/i); // Match, returns [ 'Hello' ]
const frStr = "J'ai déjà quelque chose de prévu";
frStr.match(/déjà/i); // Match, returns [ 'déjà' ]
frStr.match(/deja/i); // No match, returns null
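The regex examples above use literals. When the search term is only known at runtime (user input, for instance), it has to go through the RegExp constructor, and any character with a special meaning in a regex should be escaped first. Here is a minimal sketch of that idea. `escapeRegExp` and `includesIgnoreCase` are hypothetical helper names introduced for illustration, not built-ins.

```typescript
// Hypothetical helper: escape characters that have a special meaning in a regex.
const escapeRegExp = (value: string): string =>
  value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: case-insensitive search for a term known only at runtime.
const includesIgnoreCase = (text: string, term: string): boolean =>
  new RegExp(escapeRegExp(term), 'i').test(text);

includesIgnoreCase('Hello world', 'WORLD'); // true
includesIgnoreCase('Is it 2+2?', '2+2?'); // true, "+" and "?" are matched literally
includesIgnoreCase('abc', 'a.c'); // false, "." is matched literally
includesIgnoreCase("J'ai déjà quelque chose de prévu", 'deja'); // false, accents still matter
```

The last call shows that the i flag only handles case. Accents remain significant.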
All these methods are quick and easy, but they have limits. None of them finds an accented word when the search term is typed without accents.
Normalized strings comparison
A common way to compare strings is to normalize them. Converting both strings to lowercase is often enough to solve case-sensitivity issues, but it is only the first step.
const str = 'Hello world';
const search = 'hello';
str.toLocaleLowerCase().includes(search.toLocaleLowerCase()); // Match, returns true
const str = "J'ai déjà quelque chose de prévu";
const search = 'déjà';
const otherSearch = 'deja';
str.toLocaleLowerCase().includes(search.toLocaleLowerCase()); // Match, returns true
str.toLocaleLowerCase().includes(otherSearch.toLocaleLowerCase()); // No match, returns false
For a language like English, that is enough. For a language like French, it is not. Let's normalize the strings a bit further by removing accents (diacritics) with the helper function below.
const removeDiacritics = (str: string) => {
  return str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
};
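This removes all diacritics from the string in two steps. First, normalize('NFD') returns the string in Unicode Normalization Form D (canonical decomposition), which splits each accented character into a base letter followed by one or more combining marks. Then replace strips those combining marks (the range U+0300 to U+036F). Easy as pie.
Let's go a step further with another helper function.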
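To make the decomposition step concrete, the sketch below inspects what `normalize('NFD')` produces for a single accented letter. It also illustrates a limitation worth checking against your own data: letters such as `œ` or `ø` have no canonical decomposition, so this approach leaves them untouched. The helper is repeated so that the snippet runs on its own.

```typescript
// Same helper as in the article, repeated so the snippet is self-contained.
const removeDiacritics = (str: string): string =>
  str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');

const composed = '\u00e9'; // 'é' as a single code point
const decomposed = composed.normalize('NFD'); // 'e' followed by U+0301 (combining acute accent)

composed.length; // 1
decomposed.length; // 2
decomposed === 'e\u0301'; // true

removeDiacritics('déjà prévu'); // 'deja prevu'

// Letters without a canonical decomposition are not changed.
removeDiacritics('cœur'); // 'cœur'
removeDiacritics('smørrebrød'); // 'smørrebrød'
```

If such letters matter for your use case, an explicit replacement table applied before the comparison is one possible complement.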
const normalizeString = (str: string) => {
  return removeDiacritics(str.toLocaleLowerCase().trim());
};
This transforms the string to lower case, trims surrounding whitespace, and removes all diacritics to make the comparison easier.
const str = "J'ai déjà quelque chose de prévu";
const search = 'déjà';
const otherSearch = 'deja';
normalizeString(str).includes(normalizeString(search)); // Match, returns true
normalizeString(str).includes(normalizeString(otherSearch)); // Match, returns true
We finally have a solution for comparing strings in a way that is both accent and case-insensitive. A must-have for many international applications.
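As a usage sketch, the two helpers can be wrapped into a small filter, for example behind a search box over a list of labels. `matchesSearch`, `filterBySearch` and the sample `cities` array are hypothetical names and data introduced for illustration. The helpers are repeated so the snippet runs standalone.

```typescript
// Helpers from the article, repeated so the snippet is self-contained.
const removeDiacritics = (str: string): string =>
  str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');

const normalizeString = (str: string): string =>
  removeDiacritics(str.toLocaleLowerCase().trim());

// Hypothetical helper: true when `search` occurs in `text`, ignoring case and accents.
const matchesSearch = (text: string, search: string): boolean =>
  normalizeString(text).includes(normalizeString(search));

// Hypothetical helper: keep only the items that match the search term.
const filterBySearch = (items: string[], search: string): string[] =>
  items.filter((item) => matchesSearch(item, search));

// Sample data, made up for this example.
const cities = ['Orléans', 'Nîmes', 'Besançon', 'Paris'];

matchesSearch("J'ai déjà quelque chose de prévu", 'DEJA'); // true
filterBySearch(cities, 'orleans'); // ['Orléans']
filterBySearch(cities, 'BESANCON'); // ['Besançon']
filterBySearch(cities, ' nimes '); // ['Nîmes']
```

Note that an empty search term matches every item, because includes('') is always true. Depending on the interface, you may want to handle that case separately.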
Conclusion
In this article, we have seen how to search for one string inside another while ignoring both case and accents. This is a crucial feature for any international application that is displayed in multiple languages.
About the author
Front-end developer focused on React, Next.js, and clean, scalable CSS. Once building in PHP, Flash and others, now crafting layouts that (mostly) behave as expected. Greg's background in web design (Photoshop, Illustrator) shaped his love for clean layouts and CSS details. This blog is his way of giving back: sharing what he has learned from the same community that keeps inspiring him.
