
Filter duplicate object in Flux reactor


1. Overview

In this article, we will learn how to filter duplicate objects in a Reactor Flux. A Flux represents a reactive sequence of 0..N items, while a Mono represents a single-value-or-empty (0..1) result. To learn more about Spring Reactive, refer to our other articles.

2. Filter duplicate object in Flux reactor

Flux provides the distinct method to filter out duplicates across the whole sequence.

2.1. Filter duplicate values of Flux sequence

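The no-argument distinct() method tracks every element of the Flux and filters out duplicates. It internally records the values in a HashSet for distinct detection, so elements are compared using equals and hashCode. Because the seen values are retained while the sequence is active, memory use grows with the number of unique elements.

For example, the following code creates a Flux that contains duplicate values.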

@Test
public void filterDuplicates() {
    Flux<String> fluxCollection = Flux.just("Mike", "Ram", "Mike", "John", "Kevin", "Chris", "Kevin");
    fluxCollection.distinct()
       .subscribe(System.out::println);
}

If you execute the preceding test case, it prints the following:

Mike
Ram
John
Kevin
Chris
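
To make the HashSet-based detection more concrete, the following plain-Java sketch applies the same first-occurrence idea to an ordinary list. It only illustrates the concept and is not Reactor's actual implementation; the class DistinctSketch and the method firstOccurrences are names made up for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Reactor: illustrates HashSet-based duplicate detection.
final class DistinctSketch {

    // Keeps the first occurrence of each value and preserves encounter order.
    static <T> List<T> firstOccurrences(List<T> source) {
        Set<T> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T item : source) {
            // Set.add returns false when an equal element is already present.
            if (seen.add(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = List.of("Mike", "Ram", "Mike", "John", "Kevin", "Chris", "Kevin");
        System.out.println(firstOccurrences(names)); // [Mike, Ram, John, Kevin, Chris]
    }
}
```

As in the Flux example, the second "Mike" and the second "Kevin" are dropped because an equal value has already been recorded.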

2.2. Filter duplicate objects using a field

You often need to remove duplicates of custom objects based on a particular field. To do so, you can use the following distinct variant:

public final <V> Flux<T> distinct(Function<? super T, ? extends V> keySelector) 

This variant takes a key selector function, which extracts the key used to compare and filter the objects. It keeps the first occurrence of each key and drops every later duplicate.

Assume students are registering for the same event multiple times, and you want to remove the duplicate entries so that there is only one registration entry per student.

Here, the code uses the studentId field as the comparison key and removes the duplicates. If the same student registers multiple times, only the first registration is treated as valid and subsequent registrations are ignored.

@Test
public void filterDuplicatesById() {
    Flux<Registration> fluxCollection = Flux.just(
            new Registration(12, "Mike", "Jackson", "Registration success"),
            new Registration(11, "Ram", "Charan", "Registration success"),
            new Registration(12, "Mike", "Jackson", "Registered again"),
            new Registration(11, "Ram", "Charan", "Registered again"));
    fluxCollection.distinct(Registration::getStudentId)
            .subscribe(System.out::println);
}

If you execute the preceding code, the following is printed on the console:

Registration{userId=12, firstName='Mike', lastName='Jackson', comments='Registration success'}
Registration{userId=11, firstName='Ram', lastName='Charan', comments='Registration success'}

Note that this method keeps the first occurrence and ignores the others.
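
The Registration class itself is not shown in this article. The sketch below assumes a simple immutable shape for it, inferred from the constructor call and the getStudentId reference in the test, and pairs it with a plain-Java helper that filters by key in the same first-occurrence manner. The helper DistinctByKeySketch.distinctBy is hypothetical and is not Reactor code; it is only meant to illustrate the role of the key selector.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Assumed shape of Registration; the real class is not shown in the article.
final class Registration {
    private final int studentId;
    private final String firstName;
    private final String lastName;
    private final String comments;

    Registration(int studentId, String firstName, String lastName, String comments) {
        this.studentId = studentId;
        this.firstName = firstName;
        this.lastName = lastName;
        this.comments = comments;
    }

    int getStudentId() { return studentId; }
    String getComments() { return comments; }

    @Override
    public String toString() {
        // The "userId" label mirrors the console output shown in the article.
        return "Registration{userId=" + studentId + ", firstName='" + firstName
                + "', lastName='" + lastName + "', comments='" + comments + "'}";
    }
}

// Hypothetical helper, not part of Reactor: first-occurrence filtering by key.
final class DistinctByKeySketch {

    static <T, K> List<T> distinctBy(List<T> source, Function<? super T, ? extends K> keySelector) {
        Set<K> seenKeys = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T item : source) {
            // Only the extracted key is remembered, not the whole object.
            if (seenKeys.add(keySelector.apply(item))) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Registration> all = List.of(
                new Registration(12, "Mike", "Jackson", "Registration success"),
                new Registration(11, "Ram", "Charan", "Registration success"),
                new Registration(12, "Mike", "Jackson", "Registered again"),
                new Registration(11, "Ram", "Charan", "Registered again"));
        distinctBy(all, Registration::getStudentId).forEach(System.out::println);
    }
}
```

Since only the key decides whether an element is a duplicate, the Registration class does not need to override equals or hashCode for this to work; the key type (here Integer) must provide them.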

2.3. Filter duplicate objects using groupBy

The distinct method described above removes duplicate entries and keeps the first occurrence. What if you want to keep the last occurrence, or choose by some other rule?

Assume you are receiving duplicate comments from the same user, and you want to keep only the latest comment, based on its update date, and ignore the rest.

You can combine the following operators to remove duplicate objects using a custom condition:

  1. groupBy – Groups the comments by userId.
  2. reduce – Reduces each group to a single object, keeping only the latest comment of each user.
  3. flatMap – Merges the reduced result of every group back into one Flux of unique values.

@Test
public void filterDuplicatesGroupBy() {
    Flux<Comments> fluxCollection = Flux.just(
            new Comments(12, "Mike", "Jackson",
                    LocalDate.of(2021, 12, 31)),
            new Comments(11, "Ram", "Charan",
                    LocalDate.of(2021, 11, 3)),
            new Comments(12, "Mike", "Jackson",
                    LocalDate.of(2022, 1, 3)),
            new Comments(11, "Ram", "Charan",
                    LocalDate.of(2022, 1, 10)));
    fluxCollection
            .groupBy(Comments::getUserId)
            // Within each group, keep the comment with the later update date.
            .flatMap(g -> g.reduce((a, b) ->
                    a.getUpdateDate().compareTo(b.getUpdateDate()) > 0 ? a : b))
            .subscribe(System.out::println);
}

If you execute the preceding code, the following is printed on the console:

Comments{userId=11, firstName='Ram', lastName='Charan', comments='2022-01-10'}
Comments{userId=12, firstName='Mike', lastName='Jackson', comments='2022-01-03'}

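Note that this approach keeps the latest comment of each user and ignores the others. Because reduce emits its result only when a group completes, it suits finite sequences.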
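
The Comments class is not shown in this article either. The following plain-Java sketch assumes a shape for it, inferred from the constructor call and the getters used in the test, and applies the same "keep the latest per user" rule with Map.merge on an ordinary list. It is an illustration of the group-then-reduce idea only, not Reactor code; LatestPerKeySketch and latestPerUser are hypothetical names.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Assumed shape of Comments; the real class is not shown in the article.
final class Comments {
    private final int userId;
    private final String firstName;
    private final String lastName;
    private final LocalDate updateDate;

    Comments(int userId, String firstName, String lastName, LocalDate updateDate) {
        this.userId = userId;
        this.firstName = firstName;
        this.lastName = lastName;
        this.updateDate = updateDate;
    }

    int getUserId() { return userId; }
    LocalDate getUpdateDate() { return updateDate; }

    @Override
    public String toString() {
        // The "comments" label mirrors the console output shown in the article.
        return "Comments{userId=" + userId + ", firstName='" + firstName
                + "', lastName='" + lastName + "', comments='" + updateDate + "'}";
    }
}

// Hypothetical helper, not part of Reactor: group by userId, keep the latest entry.
final class LatestPerKeySketch {

    static List<Comments> latestPerUser(List<Comments> source) {
        // LinkedHashMap keeps keys in the order in which each userId first appears.
        Map<Integer, Comments> latest = new LinkedHashMap<>();
        for (Comments c : source) {
            // Same reducer as in the Flux example: a is the stored value, b the new one.
            latest.merge(c.getUserId(), c, (a, b) ->
                    a.getUpdateDate().compareTo(b.getUpdateDate()) > 0 ? a : b);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<Comments> all = List.of(
                new Comments(12, "Mike", "Jackson", LocalDate.of(2021, 12, 31)),
                new Comments(11, "Ram", "Charan", LocalDate.of(2021, 11, 3)),
                new Comments(12, "Mike", "Jackson", LocalDate.of(2022, 1, 3)),
                new Comments(11, "Ram", "Charan", LocalDate.of(2022, 1, 10)));
        latestPerUser(all).forEach(System.out::println);
    }
}
```

In this sketch the results follow the order in which each userId first appears (12, then 11), which can differ from the order in which the Flux version emits the reduced groups.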

3. Conclusion

To sum up, we have learned how to filter duplicate objects in a Reactor Flux.
