
Interview Question: How Do You Remove Duplicate Elements from a List?

Time: 2023-03-21 13:44:35 科技观察

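To reprint this article, please contact the Java极客技术 WeChat official account.

1. Where the problem comes from

In day-to-day development we often run into this difficulty: a collection holds many duplicate objects, and those objects have no primary key, yet the business requires that all duplicates be filtered out.

The brute-force approach is a two-level loop driven by the business rule: an element that is not yet in the new collection is added, and an element that is already there is skipped.

As an example, create an entity class PenBean:

    /**
     * pen entity
     */
    public class PenBean {

        /** type */
        private String type;

        /** color */
        private String color;

        // ... setters and getters omitted

        public PenBean(String type, String color) {
            this.type = type;
            this.color = color;
        }

        @Override
        public String toString() {
            return "PenBean{" +
                    "type='" + type + '\'' +
                    ", color='" + color + '\'' +
                    '}';
        }
    }

Test demo:

    // requires java.util.ArrayList and java.util.List
    public static void main(String[] args) {
        // add data
        List<PenBean> penBeanList = new ArrayList<>();
        penBeanList.add(new PenBean("pencil", "black"));
        penBeanList.add(new PenBean("pencil", "white"));
        penBeanList.add(new PenBean("pencil", "black"));
        penBeanList.add(new PenBean("gel pen", "white"));
        penBeanList.add(new PenBean("gel pen", "white"));

        // new list
        List<PenBean> newPenBeanList = new ArrayList<>();

        // traditional duplicate check
        for (PenBean penBean : penBeanList) {
            if (newPenBeanList.isEmpty()) {
                newPenBeanList.add(penBean);
            } else {
                boolean isSame = false;
                for (PenBean newPenBean : newPenBeanList) {
                    // compare by type and color;
                    // skip the element if the new list already has it
                    if (penBean.getType().equals(newPenBean.getType())
                            && penBean.getColor().equals(newPenBean.getColor())) {
                        isSame = true;
                        break;
                    }
                }
                if (!isSame) {
                    newPenBeanList.add(penBean);
                }
            }
        }

        // print the result
        System.out.println("=========new data======");
        for (PenBean penBean : newPenBeanList) {
            System.out.println(penBean.toString());
        }
    }

Output:

    =========new data======
    PenBean{type='pencil', color='black'}
    PenBean{type='pencil', color='white'}
    PenBean{type='gel pen', color='white'}

This is the usual way to deduplicate array-like collections of objects: filter the elements into a new collection that contains no duplicates.

Is there a more concise way to write it? Yes: the contains() method of List.

2. Deduplicating with List.contains()

Before contains() can be used, the PenBean class must override equals(). Why? We explain this in detail below.

First, override equals() in PenBean:

    // requires java.util.Objects
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        PenBean penBean = (PenBean) o;
        // equal only when both type and color match
        return Objects.equals(type, penBean.type) &&
                Objects.equals(color, penBean.color);
    }

Modify the test demo as follows:

    public static void main(String[] args) {
        // add data
        List<PenBean> penBeanList = new ArrayList<>();
        penBeanList.add(new PenBean("pencil", "black"));
        penBeanList.add(new PenBean("pencil", "white"));
        penBeanList.add(new PenBean("pencil", "black"));
        penBeanList.add(new PenBean("gel pen", "white"));
        penBeanList.add(new PenBean("gel pen", "white"));

        // new list
        List<PenBean> newPenBeanList = new ArrayList<>();

        // use contains() to check whether an equal element already exists
        for (PenBean penBean : penBeanList) {
            if (!newPenBeanList.contains(penBean)) {
                newPenBeanList.add(penBean);
            }
        }

        // print the result
        System.out.println("=========new data======");
        for (PenBean penBean : newPenBeanList) {
            System.out.println(penBean.toString());
        }
    }

Output:

    =========new data======
    PenBean{type='pencil', color='black'}
    PenBean{type='pencil', color='white'}
    PenBean{type='gel pen', color='white'}

If PenBean does not override equals(), contains() returns false for every element here, because the inherited Object.equals() compares references and each element is a separate instance. The new list is then identical to the source list, and no duplicates are removed.

So how does contains() decide whether an equal element is already in the collection? Open the contains() method in ArrayList:

    public boolean contains(Object o) {
        return indexOf(o) >= 0;
    }

Follow it into indexOf(o):

    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i] == null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }

For a non-null argument, indexOf() calls equals() against each stored element, which is why PenBean has to override equals() for contains() to recognize duplicates.

3. Deduplicating with Java 8 streams

Test demo:

    // requires java.util.stream.Collectors
    public static void main(String[] args) {
        // add data
        List<PenBean> penBeanList = new ArrayList<>();
        penBeanList.add(new PenBean("pencil", "black"));
        penBeanList.add(new PenBean("pencil", "white"));
        penBeanList.add(new PenBean("pencil", "black"));
        penBeanList.add(new PenBean("gel pen", "white"));
        penBeanList.add(new PenBean("gel pen", "white"));

        // deduplicate the list with a Java 8 stream
        List<PenBean> newPenBeanList = penBeanList.stream()
                .distinct()
                .collect(Collectors.toList());

        // print the result
        System.out.println("=========new data======");
        for (PenBean penBean : newPenBeanList) {
            System.out.println(penBean.toString());
        }
    }

This deduplicates the list with Stream.distinct(), available since JDK 1.8. distinct() relies on hashCode() and equals() to identify distinct elements, so with this approach the element class must override both hashCode() and equals().

Override hashCode() in PenBean:

    @Override
    public int hashCode() {
        return Objects.hash(type, color);
    }

Run the test demo; the result is:

    =========new data======
    PenBean{type='pencil', color='black'}
    PenBean{type='pencil', color='white'}
    PenBean{type='gel pen', color='white'}

The duplicates are removed.

Then why is no overriding needed when the collection elements are Strings? Because the JDK's String class already overrides both methods. The source (JDK 8) is:

    @Override
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String) anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;
            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }

4. Deduplicating with HashSet

Test demo:

    // requires java.util.HashSet
    public static void main(String[] args) {
        // add data
        List<PenBean> penBeanList = new ArrayList<>();
        penBeanList.add(new PenBean("pencil", "black"));
        penBeanList.add(new PenBean("pencil", "white"));
        penBeanList.add(new PenBean("pencil", "black"));
        penBeanList.add(new PenBean("gel pen", "white"));
        penBeanList.add(new PenBean("gel pen", "white"));

        // new list
        List<PenBean> newPenBeanList = new ArrayList<>();

        // deduplicate with a HashSet
        HashSet<PenBean> set = new HashSet<>(penBeanList);
        newPenBeanList.addAll(set);

        // print the result
        System.out.println("=========new data======");
        for (PenBean penBean : newPenBeanList) {
            System.out.println(penBean.toString());
        }
    }

Output:

    =========new data======
    PenBean{type='pencil', color='white'}
    PenBean{type='pencil', color='black'}
    PenBean{type='gel pen', color='white'}

The returned collection clearly has no duplicates. Note that the order differs from the input: HashSet does not guarantee iteration order, so the exact order may vary.

How does HashSet do it? Open the HashSet source and look at the constructor we called:

    public HashSet(Collection<? extends E> c) {
        map = new HashMap<>(Math.max((int) (c.size() / .75f) + 1, 16));
        addAll(c);
    }

It first creates a HashMap and then calls addAll(). Follow that method:

    public boolean addAll(Collection<? extends E> c) {
        boolean modified = false;
        for (E e : c)
            if (add(e))
                modified = true;
        return modified;
    }

It iterates over the elements of the List and calls add() for each one. The source of add() is:

    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

So it simply inserts the element into the HashMap. PRESENT here is a constant holding a new Object():

    private static final Object PRESENT = new Object();

At this point the picture is basically clear. Adding an element to a HashSet is effectively equivalent to:

    Map<E, Object> map = new HashMap<>();
    map.put(e, new Object()); // e is the element being inserted

The inserted element e is the key in the HashMap. HashMap uses hashCode() and equals() to decide whether an inserted key is the same as an existing key. So once PenBean overrides equals() and hashCode(), elements judged to be the same key are stored only once, which achieves the deduplication.

Finally, addAll() copies the deduplicated HashSet into an ArrayList, giving us the duplicate-free data we want.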
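The three approaches discussed above (contains(), Stream.distinct(), and HashSet) can be put side by side in one small program. What follows is only a sketch under stated assumptions: the class name DedupSketch, the nested class Pen (a stand-in for PenBean), and the helper method names are all hypothetical and do not come from the article. The last helper uses LinkedHashSet, a variant the article does not cover, included here only because it keeps the first-seen order that a plain HashSet may lose.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class DedupSketch {

    /** Hypothetical stand-in for PenBean, with equals() and hashCode() overridden. */
    static final class Pen {
        final String type;
        final String color;

        Pen(String type, String color) {
            this.type = type;
            this.color = color;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Pen other = (Pen) o;
            return Objects.equals(type, other.type) && Objects.equals(color, other.color);
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, color);
        }

        @Override
        public String toString() {
            return "Pen{type='" + type + "', color='" + color + "'}";
        }
    }

    /** Approach 2: relies on equals() only; keeps first-seen order. */
    static <T> List<T> dedupWithContains(List<T> input) {
        List<T> result = new ArrayList<>();
        for (T item : input) {
            if (!result.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    /** Approach 3: relies on equals() and hashCode(); keeps first-seen order for a List source. */
    static <T> List<T> dedupWithStream(List<T> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    /** Approach 4: relies on equals() and hashCode(); iteration order is unspecified. */
    static <T> List<T> dedupWithHashSet(List<T> input) {
        return new ArrayList<>(new HashSet<>(input));
    }

    /** Variant not in the article: LinkedHashSet keeps first-seen order. */
    static <T> List<T> dedupWithLinkedHashSet(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    /** Same five sample entries as the demos: three distinct pens, two duplicates. */
    static List<Pen> sample() {
        List<Pen> pens = new ArrayList<>();
        pens.add(new Pen("pencil", "black"));
        pens.add(new Pen("pencil", "white"));
        pens.add(new Pen("pencil", "black"));
        pens.add(new Pen("gel pen", "white"));
        pens.add(new Pen("gel pen", "white"));
        return pens;
    }

    public static void main(String[] args) {
        List<Pen> pens = sample();
        System.out.println(dedupWithContains(pens));
        System.out.println(dedupWithStream(pens));
        System.out.println(dedupWithHashSet(pens));
        System.out.println(dedupWithLinkedHashSet(pens));
    }
}
```

Each helper returns three elements for the sample data. The contains() version is quadratic in the list size, so for large lists one of the hash-based versions is probably the better default, provided the element class overrides both equals() and hashCode().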