How to merge two arrays in JavaScript and de-duplicate items
I have two JavaScript arrays:
var array1 = ["Vijendra","Singh"];
var array2 = ["Singh", "Shakya"];
I want the output to be:
var array3 = ["Vijendra","Singh","Shakya"];
The output array should have repeated words removed.
How do I merge two arrays in JavaScript so that I get only the unique items from each array in the same order they were inserted into the original arrays?
To just merge the arrays (without removing duplicates)
ES5 version, using Array.prototype.concat:
var array1 = ["Vijendra", "Singh"];
var array2 = ["Singh", "Shakya"];
array1 = array1.concat(array2);
console.log(array1);
ES6 version, using spread syntax:
const array1 = ["Vijendra","Singh"];
const array2 = ["Singh", "Shakya"];
const array3 = [...array1, ...array2];
Since there is no 'built in' way to remove duplicates (ECMA-262 actually has Array.forEach, which would be great for this), we have to do it manually:
Array.prototype.unique = function() {
var a = this.concat();
for(var i=0; i<a.length; ++i) {
for(var j=i+1; j<a.length; ++j) {
if(a[i] === a[j])
a.splice(j--, 1);
}
}
return a;
};
Then, to use it:
var array1 = ["Vijendra","Singh"];
var array2 = ["Singh", "Shakya"];
// Merges both arrays and gets unique items
var array3 = array1.concat(array2).unique();
This will also preserve the order of the arrays (i.e, no sorting needed).
Since many people are annoyed about prototype augmentation of Array.prototype and for in loops, here is a less invasive way to use it:
function arrayUnique(array) {
var a = array.concat();
for(var i=0; i<a.length; ++i) {
for(var j=i+1; j<a.length; ++j) {
if(a[i] === a[j])
a.splice(j--, 1);
}
}
return a;
}
var array1 = ["Vijendra","Singh"];
var array2 = ["Singh", "Shakya"];
// Merges both arrays and gets unique items
var array3 = arrayUnique(array1.concat(array2));
For those who are fortunate enough to work with browsers where ES5 is available, you can use Object.defineProperty like this:
Object.defineProperty(Array.prototype, 'unique', {
enumerable: false,
configurable: false,
writable: false,
value: function() {
var a = this.concat();
for(var i=0; i<a.length; ++i) {
for(var j=i+1; j<a.length; ++j) {
if(a[i] === a[j])
a.splice(j--, 1);
}
}
return a;
}
});
With Underscore.js or Lo-Dash you can do:
console.log(_.union([1, 2, 3], [101, 2, 1, 10], [2, 1]));
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.15/lodash.min.js"></script>
http://underscorejs.org/#union
http://lodash.com/docs#union
First concatenate the two arrays, next filter out only the unique items:
var a = [1, 2, 3], b = [101, 2, 1, 10]
var c = a.concat(b)
var d = c.filter((item, pos) => c.indexOf(item) === pos)
console.log(d) // d is [1, 2, 3, 101, 10]
Edit
As suggested, a more performant solution is to filter out the items of b that already appear in a before concatenating with a:
var a = [1, 2, 3], b = [101, 2, 1, 10]
var c = a.concat(b.filter((item) => a.indexOf(item) < 0))
console.log(c) // c is [1, 2, 3, 101, 10]
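One caveat worth knowing about the `c.indexOf(item) === pos` filter: indexOf compares with strict equality, so it never finds NaN. Here is a minimal sketch (the sample values are made up for illustration) comparing it with a Set, which treats NaN as equal to itself:

```javascript
// Hypothetical sample data containing NaN in both arrays.
const first = [1, NaN, 2];
const second = [2, NaN, 3];
const combined = first.concat(second); // [1, NaN, 2, 2, NaN, 3]

// indexOf uses ===, and NaN !== NaN, so indexOf(NaN) is always -1.
// The test "indexOf(item) === pos" therefore fails for every NaN and drops them all.
const viaIndexOf = combined.filter((item, pos) => combined.indexOf(item) === pos);

// Set uses SameValueZero, under which NaN equals NaN, so exactly one NaN survives.
const viaSet = [...new Set(combined)];

console.log(viaIndexOf); // [1, 2, 3]
console.log(viaSet);     // [1, NaN, 2, 3]
```

For the string arrays in the question this makes no difference; it only matters if the data can contain NaN.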
[...array1,...array2] // => does not remove duplicates
OR
[...new Set([...array1 ,...array2])]; // => removes duplicates
This is an ECMAScript 6 solution using spread operator and array generics.
Currently it only works with Firefox, and possibly Internet Explorer Technical Preview.
But if you use Babel, you can have it now.
const input = [
[1, 2, 3],
[101, 2, 1, 10],
[2, 1]
];
const mergeDedupe = (arr) => {
return [...new Set([].concat(...arr))];
}
console.log('output', mergeDedupe(input));
Using a Set (ECMAScript 2015), it will be as simple as that:
const array1 = ["Vijendra", "Singh"];
const array2 = ["Singh", "Shakya"];
console.log(Array.from(new Set(array1.concat(array2))));
You can do it simply with ECMAScript 6,
var array1 = ["Vijendra", "Singh"];
var array2 = ["Singh", "Shakya"];
var array3 = [...new Set([...array1 ,...array2])];
console.log(array3); // ["Vijendra", "Singh", "Shakya"];
- Use the spread operator for concatenating the array.
- Use Set for creating a distinct set of elements.
- Again use the spread operator to convert the Set into an array.
Here is a slightly different take on the loop. With some of the optimizations in the latest version of Chrome, it is the fastest method for resolving the union of the two arrays (Chrome 38.0.2111).
http://jsperf.com/merge-two-arrays-keeping-only-unique-values
var array1 = ["Vijendra", "Singh"];
var array2 = ["Singh", "Shakya"];
var array3 = [];
var arr = array1.concat(array2),
len = arr.length;
while (len--) {
var itm = arr[len];
if (array3.indexOf(itm) === -1) {
array3.unshift(itm);
}
}
while loop: ~589k ops/s
filter: ~445k ops/s
lodash: 308k ops/s
for loops: 225k ops/s
A comment pointed out that one of my setup variables was causing my loop to pull ahead of the rest, because it didn't have to initialize an empty array to write to. I agree with that, so I've rewritten the test to even the playing field, and included an even faster option.
http://jsperf.com/merge-two-arrays-keeping-only-unique-values/52
let whileLoopAlt = function (array1, array2) {
const array3 = array1.slice(0);
let len1 = array1.length;
let len2 = array2.length;
const assoc = {};
while (len1--) {
assoc[array1[len1]] = null;
}
while (len2--) {
let itm = array2[len2];
if (assoc[itm] === undefined) { // Eliminate the indexOf call
array3.push(itm);
assoc[itm] = null;
}
}
return array3;
};
In this alternate solution, I've combined one answer's associative array solution to eliminate the .indexOf() call in the loop, which was slowing things down a lot with a second loop, and included some of the other optimizations that other users have suggested in their answers as well.
The top answer here with the double loop on every value (i-1) is still significantly slower. lodash is still going strong, and I would still recommend it to anyone who doesn't mind adding a library to their project. For those who don't want to, my while loop is still a good answer, and the filter answer has a very strong showing here, beating out all others in my tests with the latest Canary Chrome (44.0.2360) as of this writing.
Check out Mike's answer and Dan Stocker's answer if you want to step it up a notch in speed. Those are by far the fastest of all results after going through almost all of the viable answers.
I simplified the best of this answer and turned it into a nice function:
function mergeUnique(arr1, arr2){
return arr1.concat(arr2.filter(function (item) {
return arr1.indexOf(item) === -1;
}));
}
ES6 offers a single-line solution for merging multiple arrays without duplicates by using spread syntax and Set.
const array1 = ['a','b','c'];
const array2 = ['c','c','d','e'];
const array3 = [...new Set([...array1,...array2])];
console.log(array3); // ["a", "b", "c", "d", "e"]
Just throwing in my two cents.
function mergeStringArrays(a, b){
var hash = {};
var ret = [];
for(var i=0; i < a.length; i++){
var e = a[i];
if (!hash[e]){
hash[e] = true;
ret.push(e);
}
}
for(var i=0; i < b.length; i++){
var e = b[i];
if (!hash[e]){
hash[e] = true;
ret.push(e);
}
}
return ret;
}
This is a method I use a lot. It uses an object as a hash lookup table to do the duplicate checking. Assuming that the hash lookup is O(1), this runs in O(n), where n is a.length + b.length. I honestly have no idea how the browser implements the hash, but it performs well on many thousands of data points.
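Note that object keys are always strings, so the hash-table approach treats 1 and "1" as the same item; that is fine for the string arrays in the question. As a hedged sketch, assuming an ES2015 environment, the same single-pass idea can be written with a Set to keep values of different types apart (mergeWithSet is a name made up for this example):

```javascript
// Sketch of the same O(n) single-pass merge, using a Set instead of an object.
// mergeWithSet is a hypothetical helper name, not part of the answer above.
function mergeWithSet(a, b) {
  const seen = new Set();
  const ret = [];
  for (const list of [a, b]) {
    for (const e of list) {
      if (!seen.has(e)) { // Set lookups do not coerce values to strings
        seen.add(e);
        ret.push(e);
      }
    }
  }
  return ret;
}

console.log(mergeWithSet(["Vijendra", "Singh"], ["Singh", "Shakya"]));
// ["Vijendra", "Singh", "Shakya"]
console.log(mergeWithSet([1, "1"], ["1", 2]));
// [1, "1", 2]  (the object-hash version would return [1, 2])
```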
Just steer clear of nested loops (O(n^2)), and .indexOf() (+O(n)).
function merge(a, b) {
var hash = {};
var i;
for (i = 0; i < a.length; i++) {
hash[a[i]] = true;
}
for (i = 0; i < b.length; i++) {
hash[b[i]] = true;
}
return Object.keys(hash);
}
var array1 = ["Vijendra", "Singh"];
var array2 = ["Singh", "Shakya"];
var array3 = merge(array1, array2);
console.log(array3);
I know this question is not about arrays of objects, but searchers do end up here, so it's worth adding for future readers a proper ES6 way of merging two arrays of objects and removing duplicates:
var arr1 = [ {a: 1}, {a: 2}, {a: 3} ];
var arr2 = [ {a: 1}, {a: 2}, {a: 4} ];
var arr3 = arr1.concat(arr2.filter( ({a}) => !arr1.find(f => f.a == a) ));
// [ {a: 1}, {a: 2}, {a: 3}, {a: 4} ]
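The find-inside-filter approach rescans arr1 for every item of arr2. For larger inputs, a keyed lookup may scale better. Here is a minimal sketch, assuming each object can be identified by a single property (the `a` property from the example; mergeByKey is a made-up name). Unlike the one-liner, it also removes duplicates inside each input array and compares keys strictly rather than with ==.

```javascript
// mergeByKey is a hypothetical helper: merges arrays of objects, keeping the
// first object seen for each value of the given key property.
function mergeByKey(key, ...arrays) {
  const byKey = new Map(); // Map preserves insertion order
  for (const array of arrays) {
    for (const obj of array) {
      if (!byKey.has(obj[key])) {
        byKey.set(obj[key], obj);
      }
    }
  }
  return [...byKey.values()];
}

const arr1 = [{ a: 1 }, { a: 2 }, { a: 3 }];
const arr2 = [{ a: 1 }, { a: 2 }, { a: 4 }];
console.log(mergeByKey("a", arr1, arr2));
// [ { a: 1 }, { a: 2 }, { a: 3 }, { a: 4 } ]
```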
EDIT:
The first solution is the fastest only when there are few items. When there are over 400 items, the Set solution becomes the fastest. And when there are 100,000 items, it is a thousand times faster than the first solution.
Considering that performance matters only when there are a lot of items, and that the Set solution is by far the most readable, it should be the right solution in most cases.
The perf results below were computed with a small number of items.
Based on jsperf, the fastest way (edit: if there are fewer than 400 items) to merge two arrays is the following:
for (var i = 0; i < array2.length; i++)
if (array1.indexOf(array2[i]) === -1)
array1.push(array2[i]);
This one is 17% slower:
array2.forEach(v => array1.includes(v) ? null : array1.push(v));
This one is 45% slower (edit: when there are fewer than 100 items; it is a lot faster when there are a lot of items):
var a = [...new Set([...array1 ,...array2])];
And the accepted answer is 55% slower (and much longer to write) (edit: and it is several orders of magnitude slower than any of the other methods when there are 100,000 items):
var a = array1.concat(array2);
for (var i = 0; i < a.length; ++i) {
for (var j = i + 1; j < a.length; ++j) {
if (a[i] === a[j])
a.splice(j--, 1);
}
}
https://jsperf.com/merge-2-arrays-without-duplicate
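Since jsperf links tend to go stale, here is a rough way to repeat this kind of comparison locally. It is only a sketch: the array sizes are arbitrary, the helper names are invented for this example, and timings vary by engine and machine, so no numbers are claimed here.

```javascript
// Hypothetical helpers wrapping two of the approaches compared above.
function mergeIndexOf(array1, array2) {
  const out = array1.slice(); // copy so the input is not mutated
  for (let i = 0; i < array2.length; i++) {
    if (out.indexOf(array2[i]) === -1) out.push(array2[i]);
  }
  return out;
}

function mergeSet(array1, array2) {
  return [...new Set([...array1, ...array2])];
}

// Build two overlapping arrays of n integers each (arbitrary test data).
function makeInputs(n) {
  const left = Array.from({ length: n }, (_, i) => i);
  const right = Array.from({ length: n }, (_, i) => i + Math.floor(n / 2));
  return [left, right];
}

// Time a single call with Date.now(); crude, but enough to see the trend.
function timeIt(fn, inputs) {
  const start = Date.now();
  const result = fn(inputs[0], inputs[1]);
  return { ms: Date.now() - start, size: result.length };
}

for (const n of [100, 5000]) {
  const inputs = makeInputs(n);
  console.log(n, "indexOf:", timeIt(mergeIndexOf, inputs), "Set:", timeIt(mergeSet, inputs));
}
```

A single run like this is noisy; repeating each measurement several times and comparing medians gives a fairer picture.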
Array.prototype.merge = function(/* variable number of arrays */){
for(var i = 0; i < arguments.length; i++){
var array = arguments[i];
for(var j = 0; j < array.length; j++){
if(this.indexOf(array[j]) === -1) {
this.push(array[j]);
}
}
}
return this;
};
A much better array merge function.
Performance
On 2020.10.15 I ran tests on macOS High Sierra 10.13.6 with Chrome v86, Safari v13.1.2 and Firefox v81 for the chosen solutions.
Results
For all browsers
- solution H is fast/fastest
- solution L is fast
- solution D is fastest on Chrome for big arrays
- solution G is fast on small arrays
- solution M is slowest for small arrays
- solution E is slowest for big arrays
Details
I ran 2 test cases:
- for 2-element arrays - you can run it HERE
- for 10000-element arrays - you can run it HERE
on solutions A, B, C, D, E, G, H, J, L, M, presented in the snippet below
// https://stackoverflow.com/a/10499519/860099
function A(arr1,arr2) {
return _.union(arr1,arr2)
}
// https://stackoverflow.com/a/53149853/860099
function B(arr1,arr2) {
return _.unionWith(arr1, arr2, _.isEqual);
}
// https://stackoverflow.com/a/27664971/860099
function C(arr1,arr2) {
return [...new Set([...arr1,...arr2])]
}
// https://stackoverflow.com/a/48130841/860099
function D(arr1,arr2) {
return Array.from(new Set(arr1.concat(arr2)))
}
// https://stackoverflow.com/a/23080662/860099
function E(arr1,arr2) {
return arr1.concat(arr2.filter((item) => arr1.indexOf(item) < 0))
}
// https://stackoverflow.com/a/28631880/860099
function G(arr1,arr2) {
var hash = {};
var i;
for (i = 0; i < arr1.length; i++) {
hash[arr1[i]] = true;
}
for (i = 0; i < arr2.length; i++) {
hash[arr2[i]] = true;
}
return Object.keys(hash);
}
// https://stackoverflow.com/a/13847481/860099
function H(a, b){
var hash = {};
var ret = [];
for(var i=0; i < a.length; i++){
var e = a[i];
if (!hash[e]){
hash[e] = true;
ret.push(e);
}
}
for(var i=0; i < b.length; i++){
var e = b[i];
if (!hash[e]){
hash[e] = true;
ret.push(e);
}
}
return ret;
}
// https://stackoverflow.com/a/1584377/860099
function J(arr1,arr2) {
function arrayUnique(array) {
var a = array.concat();
for(var i=0; i<a.length; ++i) {
for(var j=i+1; j<a.length; ++j) {
if(a[i] === a[j])
a.splice(j--, 1);
}
}
return a;
}
return arrayUnique(arr1.concat(arr2));
}
// https://stackoverflow.com/a/25120770/860099
function L(array1, array2) {
const array3 = array1.slice(0);
let len1 = array1.length;
let len2 = array2.length;
const assoc = {};
while (len1--) {
assoc[array1[len1]] = null;
}
while (len2--) {
let itm = array2[len2];
if (assoc[itm] === undefined) { // Eliminate the indexOf call
array3.push(itm);
assoc[itm] = null;
}
}
return array3;
}
// https://stackoverflow.com/a/39336712/860099
function M(arr1,arr2) {
const comp = f => g => x => f(g(x));
const apply = f => a => f(a);
const flip = f => b => a => f(a) (b);
const concat = xs => y => xs.concat(y);
const afrom = apply(Array.from);
const createSet = xs => new Set(xs);
const filter = f => xs => xs.filter(apply(f));
const dedupe = comp(afrom) (createSet);
const union = xs => ys => {
const zs = createSet(xs);
return concat(xs) (
filter(x => zs.has(x)
? false
: zs.add(x)
) (ys));
}
return union(dedupe(arr1)) (arr2)
}
// -------------
// TEST
// -------------
var array1 = ["Vijendra","Singh"];
var array2 = ["Singh", "Shakya"];
[A,B,C,D,E,G,H,J,L,M].forEach(f=> {
console.log(`${f.name} [${f([...array1],[...array2])}]`);
})
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.20/lodash.min.js" integrity="sha512-90vH1Z83AJY9DmlWa8WkjkV79yfS2n2Oxhsi2dZbIv0nC4E6m5AbH8Nh156kkM7JePmqD6tcZsfad1ueoaovww==" crossorigin="anonymous"></script>
This snippet only presents the functions used in the performance tests; it does not perform the tests itself!
[Figure: example test run for Chrome]
UPDATE
I removed cases F, I and K because they modify the input arrays, which makes the benchmark give wrong results.
Why don't you use an object? It looks like you're trying to model a set. This won't preserve the order, however.
var set1 = {"Vijendra":true, "Singh":true}
var set2 = {"Singh":true, "Shakya":true}
// Merge second object into first
function merge(set1, set2){
for (var key in set2){
if (set2.hasOwnProperty(key))
set1[key] = set2[key]
}
return set1
}
merge(set1, set2)
// Create set from array
function setify(array){
var result = {}
for (var item in array){
if (array.hasOwnProperty(item))
result[array[item]] = true
}
return result
}
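To make the ordering caveat concrete: in modern engines, plain-object keys that look like array indices are listed first in ascending numeric order, and other string keys follow in insertion order. Here is a small sketch (toKeySet is a hypothetical helper) showing where order is and is not kept:

```javascript
// toKeySet is a hypothetical helper that builds an object-based "set" from an array.
function toKeySet(items) {
  const result = {};
  for (const item of items) {
    result[item] = true;
  }
  return result;
}

// Non-numeric string keys come back in insertion order.
console.log(Object.keys(toKeySet(["Vijendra", "Singh", "Shakya"])));
// ["Vijendra", "Singh", "Shakya"]

// Integer-like keys are moved to the front and sorted numerically.
console.log(Object.keys(toKeySet(["b", "10", "a", "2"])));
// ["2", "10", "b", "a"]
```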
For ES6, just one line:
a = [1, 2, 3, 4];
b = [4, 5];
[...new Set(a.concat(b))]; // [1, 2, 3, 4, 5]
The best solution...
You can check directly in the browser console by hitting...
Without duplicates
a = [1, 2, 3];
b = [3, 2, 1, "prince"];
a.concat(b.filter(function(el) {
return a.indexOf(el) === -1;
}));
With duplicates
["prince", "asish", 5].concat(["ravi", 4])
If you want no duplicates, you can try a better solution from here - Shouting Code.
[1, 2, 3].concat([3, 2, 1, "prince"].filter(function(el) {
return [1, 2, 3].indexOf(el) === -1;
}));
Try it in the Chrome browser console (F12 > Console).
Output:
["prince", "asish", 5, "ravi", 4]
[1, 2, 3, "prince"]
My one and a half penny:
Array.prototype.concat_n_dedupe = function(other_array) {
return this
.concat(other_array) // add second
.reduce(function(uniques, item) { // dedupe all
if (uniques.indexOf(item) == -1) {
uniques.push(item);
}
return uniques;
}, []);
};
var array1 = ["Vijendra","Singh"];
var array2 = ["Singh", "Shakya"];
var result = array1.concat_n_dedupe(array2);
console.log(result);
There are many solutions for merging two arrays. They can be divided into two main categories (leaving aside 3rd-party libraries like lodash or underscore.js).
a) Combine the two arrays and then remove duplicated items.
b) Filter out duplicated items before combining the arrays.
Combine two arrays and remove duplicated items
Combining
// mutating operations (array1 becomes the combined array); pick one
array1.push(...array2);
// array1.unshift(...array2);
// non-mutating operations; pick one
const combined = array1.concat(array2);
// const combined = [...array1, ...array2]; // ES6
Unifying
There are many ways to de-duplicate an array; I personally suggest the two methods below.
// a little bit tricky: keep an item only at its first index
const merged = combined.filter((item, index) => combined.indexOf(item) === index);
// alternative: const merged = [...new Set(combined)];
Filter out items before combining them
There are also many ways, but I personally suggest the below code due to its simplicity.
const merged = array1.concat(array2.filter(secItem => !array1.includes(secItem)));
You can achieve it simply using Underscore.js's uniq:
array3 = _.uniq(array1.concat(array2))
console.log(array3)
It will print ["Vijendra", "Singh", "Shakya"].
New solution (which uses Array.prototype.indexOf and Array.prototype.concat):
Array.prototype.uniqueMerge = function( a ) {
for ( var nonDuplicates = [], i = 0, l = a.length; i<l; ++i ) {
if ( this.indexOf( a[i] ) === -1 ) {
nonDuplicates.push( a[i] );
}
}
return this.concat( nonDuplicates )
};
Usage:
>>> ['Vijendra', 'Singh'].uniqueMerge(['Singh', 'Shakya'])
["Vijendra", "Singh", "Shakya"]
Array.prototype.indexOf (for Internet Explorer):
Array.prototype.indexOf = Array.prototype.indexOf || function(elt)
{
var len = this.length >>> 0;
var from = Number(arguments[1]) || 0;
from = (from < 0) ? Math.ceil(from): Math.floor(from);
if (from < 0)from += len;
for (; from < len; from++)
{
if (from in this && this[from] === elt)return from;
}
return -1;
};
It can be done using Set.
var array1 = ["Vijendra","Singh"];
var array2 = ["Singh", "Shakya"];
var array3 = array1.concat(array2);
var tempSet = new Set(array3);
array3 = Array.from(tempSet);
//show output
document.body.querySelector("div").innerHTML = JSON.stringify(array3);
<div style="width:100%;height:4rem;line-height:4rem;background-color:steelblue;color:#DDD;text-align:center;font-family:Calibri" >
temp text
</div>
//Array.indexOf was introduced in javascript 1.6 (ECMA-262)
//We need to implement it explicitly for other browsers,
if (!Array.prototype.indexOf)
{
Array.prototype.indexOf = function(elt, from)
{
var len = this.length >>> 0;
from = Number(from) || 0; // start at index 0 when no start index is passed
for (; from < len; from++)
{
if (from in this &&
this[from] === elt)
return from;
}
return -1;
};
}
//now, on to the problem
var array1 = ["Vijendra","Singh"];
var array2 = ["Singh", "Shakya"];
var merged = array1.concat(array2);
var t;
for(var i = 0; i < merged.length; i++)
if((t = merged.indexOf(merged[i], i + 1)) != -1) // indexOf(searchElement, fromIndex)
{
merged.splice(t, 1);
i--;//in case of multiple occurrences
}
The implementation of the indexOf method for other browsers is taken from MDC.
Array.prototype.add = function(b){
var a = this.concat(); // clone current object
if(!b.push || !b.length) return a; // if b is not an array, or empty, then return a unchanged
if(!a.length) return b.concat(); // if original is empty, return b
// go through all the elements of b
for(var i = 0; i < b.length; i++){
// if b's value is not in a, then add it
if(a.indexOf(b[i]) == -1) a.push(b[i]);
}
return a;
}
// Example:
console.log([1,2,3].add([3, 4, 5])); // will output [1, 2, 3, 4, 5]
array1.concat(array2).filter((value, pos, arr)=>arr.indexOf(value)===pos)
The nice thing about this one is performance, and that when working with arrays you are generally chaining methods like filter, map, etc., so you can add that line and it will concat and deduplicate array2 with array1 without needing a reference to the latter (which you don't have when you are chaining methods). Example:
someSource()
.reduce(...)
.filter(...)
.map(...)
// and now you want to concat array2 and deduplicate:
.concat(array2).filter((value, pos, arr)=>arr.indexOf(value)===pos)
// and keep chaining stuff
.map(...)
.find(...)
// etc
(I don't like to pollute Array.prototype, and that would be the only other way to respect the chain, since defining a new function would break it, so I think something like this is the only way to accomplish that.)
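To make the chaining idea concrete, here is a small self-contained sketch; the data and the intermediate steps are invented for illustration.

```javascript
// Hypothetical data: the first array is produced mid-chain, array2 is merged in.
const array2 = ["Singh", "Shakya"];

const result = ["vijendra", "singh"]
  .map(name => name[0].toUpperCase() + name.slice(1)) // ["Vijendra", "Singh"]
  // concat array2, then keep only the first occurrence of each value;
  // the third callback argument is the concatenated array itself
  .concat(array2)
  .filter((value, pos, arr) => arr.indexOf(value) === pos)
  // and keep chaining
  .map(name => name.length);

console.log(result); // [8, 5, 6]
```

The third argument of the filter callback is what removes the need for a named reference to the intermediate array.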
You can use new Set to remove duplicates:
[...new Set([...array1 ,...array2])]
A functional approach with ES2015
Following the functional approach, a union of two Arrays is just the composition of concat and filter. In order to provide optimal performance we resort to the native Set data type, which is optimized for property lookups.
Anyway, the key question in conjunction with a union function is how to treat duplicates. The following permutations are possible:
Array A + Array B
[unique] + [unique]
[duplicated] + [unique]
[unique] + [duplicated]
[duplicated] + [duplicated]
The first two permutations are easy to handle with a single function. However, the last two are more complicated, since you can't process them as long as you rely on Set lookups. Since switching to plain old Object property lookups would entail a serious performance hit, the following implementation just ignores the third and fourth permutations. You would have to build a separate version of union to support them.
// small, reusable auxiliary functions
const comp = f => g => x => f(g(x));
const apply = f => a => f(a);
const flip = f => b => a => f(a) (b);
const concat = xs => y => xs.concat(y);
const afrom = apply(Array.from);
const createSet = xs => new Set(xs);
const filter = f => xs => xs.filter(apply(f));
// de-duplication
const dedupe = comp(afrom) (createSet);
// the actual union function
const union = xs => ys => {
const zs = createSet(xs);
return concat(xs) (
filter(x => zs.has(x)
? false
: zs.add(x)
) (ys));
}
// mock data
const xs = [1,2,2,3,4,5];
const ys = [0,1,2,3,3,4,5,6,6];
// here we go
console.log( "unique/unique", union(dedupe(xs)) (ys) );
console.log( "duplicated/unique", union(xs) (ys) );
From here on it gets trivial to implement a unionn function, which accepts any number of arrays (inspired by naomik's comments):
// small, reusable auxiliary functions
const uncurry = f => (a, b) => f(a) (b);
const foldl = f => acc => xs => xs.reduce(uncurry(f), acc);
const apply = f => a => f(a);
const flip = f => b => a => f(a) (b);
const concat = xs => y => xs.concat(y);
const createSet = xs => new Set(xs);
const filter = f => xs => xs.filter(apply(f));
// union and unionn
const union = xs => ys => {
const zs = createSet(xs);
return concat(xs) (
filter(x => zs.has(x)
? false
: zs.add(x)
) (ys));
}
const unionn = (head, ...tail) => foldl(union) (head) (tail);
// mock data
const xs = [1,2,2,3,4,5];
const ys = [0,1,2,3,3,4,5,6,6];
const zs = [0,1,2,3,4,5,6,7,8,9];
// here we go
console.log( unionn(xs, ys, zs) );
It turns out unionn is just foldl (aka Array.prototype.reduce), which takes union as its reducer. Note: Since the implementation doesn't use an additional accumulator, it does not return an array when you apply it without arguments (head is undefined, so the fold yields undefined).
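If the zero-argument case matters, one option is to seed the fold with an empty array. This is a sketch rather than part of the implementation above; unionAll is a name chosen for this example, and union is restated in compact form so the block runs on its own.

```javascript
// Same union as above: keeps xs as-is and appends the not-yet-seen items of ys.
const union = xs => ys => {
  const zs = new Set(xs);
  return xs.concat(ys.filter(x => (zs.has(x) ? false : zs.add(x))));
};

// unionAll is a hypothetical variant of unionn that starts from an empty
// accumulator, so a call with no arguments yields an empty array.
const unionAll = (...arrays) => arrays.reduce((acc, ys) => union(acc)(ys), []);

console.log(unionAll()); // []
console.log(unionAll([1, 2, 2, 3], [0, 1, 3, 4], [4, 5]));
// [1, 2, 3, 0, 4, 5]
```

One side effect of the empty seed is that duplicates inside the first array are removed as well, because every input now goes through the filter.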
De-duplicate a single array, or merge and de-duplicate multiple array inputs. Example below.
Using ES6: Set, for...of, spread syntax
I wrote this simple function, which takes multiple array arguments. It does pretty much the same as the solution above; it just has a more practical use case. This function doesn't first concatenate duplicate values into one array only to delete them at some later stage.
SHORT FUNCTION DEFINITION (only 9 lines)
/**
 * Merges only the unique values of the given arrays. At no stage does it build an array containing duplicate values.
 *
 * @param {...Array} args One or more arrays (merged into a single array with no duplicates);
 * it can also be used to remove duplicates from a single array
 */
function arrayDeDuplicate(...args){
    let set = new Set(); // init Set object (available as of ES6)
    for(let arr of args){ // for...of loops through the arrays
        arr.forEach((value) => { // forEach visits each value
            set.add(value); // set.add only adds values not already in the Set
        });
    }
    return [...set]; // spread the Set back into an array
    // alternatively we could use: return Array.from(set);
}
USAGE EXAMPLE:
// SCENARIO
let a = [1,2,3,4,5,6];
let b = [4,5,6,7,8,9,10,10,10];
let c = [43,23,1,2,3];
let d = ['a','b','c','d'];
let e = ['b','c','d','e'];
// USAGE
let uniqueArrayAll = arrayDeDuplicate(a, b, c, d, e);
let uniqueArraySingle = arrayDeDuplicate(b);
// OUTPUT
console.log(uniqueArrayAll); // [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 43, 23, "a", "b", "c", "d", "e"]
console.log(uniqueArraySingle); // [4, 5, 6, 7, 8, 9, 10]