Implementing The Outbox-Pattern With Kafka – Part 1: Writing In The Outbox-Table

This article is part of a Blog-Series

Based on a very simple example-project
we will implement the Outbox-Pattern with Kafka.

TL;DR

In this part, we will implement the outbox (aka: the queueing of the messages in a database-table).

The Outbox Table

The outbox is represented by an additional table in the database.
This table acts as a queue for messages that should be sent as part of the transaction.
Instead of sending the messages, the application stores them in the outbox-table.
The actual sending of the messages occurs outside of the transaction.

Because the messages are read from the table outside of the transaction context, only entries related to successfully committed transactions are visible.
Hence, the sending of the message effectively becomes a part of the transaction:
it happens only if the transaction was successfully completed.
Messages associated with an aborted transaction will not be sent.

The Implementation

No special measures need to be taken when writing the messages to the table.
The only thing to be sure of is that the writing takes place within the transaction.

In our implementation, we simply store the serialized message together with a key, which is needed for the partitioning of the data in Kafka in case the order of the messages is important.
We also store a timestamp that we plan to record as Event Time later.

One more thing that is worth noting is that we utilize the database to create a unique record-ID.
The generated unique and monotonically increasing ID is required later for the implementation of Exactly-Once semantics.

The SQL for the table looks like this:

CREATE TABLE outbox (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  key VARCHAR(127),
  value VARCHAR(1023),
  issued TIMESTAMP
);
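
The repository used to write to this table can stay very simple. A minimal sketch, assuming Spring's JdbcTemplate and the save()-signature used by the listener shown below (this class is an illustration, not necessarily the code of the example-project):

@Repository
public class OutboxRepository
{
  private final JdbcTemplate jdbcTemplate;

  public OutboxRepository(JdbcTemplate jdbcTemplate)
  {
    this.jdbcTemplate = jdbcTemplate;
  }

  public void save(String key, String value, ZonedDateTime time)
  {
    // The id is generated by the database (AUTO_INCREMENT),
    // so it is not part of the INSERT
    jdbcTemplate.update(
        "INSERT INTO outbox (key, value, issued) VALUES (?, ?, ?)",
        key,
        value,
        Timestamp.from(time.toInstant()));
  }
}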

Decoupling The Business Logic

In order to decouple the business logic from the implementation of the messaging mechanism, I have implemented a thin layer that uses Spring Application Events to publish the messages.

Messages are sent as a subclass of ApplicationEvent:

publisher.publishEvent(
  new UserEvent(
    this,
    username,
    CREATED,
    ZonedDateTime.now(clock)));

The event takes a key (username) and an object as value (an instance of an enum in our case).
An EventListener receives the events and writes them to the outbox table:

@TransactionalEventListener(phase = TransactionPhase.BEFORE_COMMIT)
public void onUserEvent(OutboxEvent event)
{
  try
  {
    repository.save(
        event.getKey(),
        mapper.writeValueAsString(event.getValue()),
        event.getTime());
  }
  catch (JsonProcessingException e)
  {
    throw new RuntimeException(e);
  }
}

The @TransactionalEventListener is not really needed here.
A normal EventListener would also suffice, because Spring executes all registered normal event listeners immediately.
Therefore, the registered listeners run in the same thread that published the event and participate in the existing transaction.

But if a @TransactionalEventListener is used, as in our example project, it is crucial that the phase is switched to BEFORE_COMMIT when the Outbox Pattern is introduced.
This is because the listener has to be executed in the same transaction context in which the event was published.
Otherwise, the writing of the messages would not be coupled to the success or rollback of the transaction, thus violating the idea of the pattern.

May The Source Be With You!

Since this part of the implementation only stores the messages in a normal database, it can be published as an independent component that does not require any dependencies on Kafka.
To highlight this, the implementation of this step does not use Kafka at all.
In a later step, we will extract the layer that decouples the business code from our messaging logic into a separate package.

The complete source code of the example-project can be cloned here:

This version only includes the logic that is needed to fill the outbox-table.
Reading the messages from this table and sending them through Kafka will be the topic of the next part of this blog-series.

The sources include a setup for Docker Compose that can be run without compiling
the project, and a runnable README.sh that compiles and runs the application and illustrates the example.

Implementing The Outbox-Pattern With Kafka – Part 0: The example

This article is part of a Blog-Series

Based on a very simple example-project
we will implement the Outbox-Pattern with Kafka.

TL;DR

In this part, a small example-project is introduced that features a component which has to inform another component about every successfully completed operation.

The Plan

In this mini-series I will implement the Outbox-Pattern
as described on Chris Richardson’s fabulous website microservices.io.

The pattern enables you to send a message as part of a database transaction in a reliable way, effectively turning the writing of the data
to the database and the sending of the message into an atomic operation:
either both operations are successful or neither.

The pattern is well known and implementing it with Kafka looks like an easy, straightforward job at first glance.
However, there are many obstacles that easily lead to an incomplete or incorrect implementation.
In this blog-series, we will circumnavigate these obstacles together step by step.

The Example Project

To illustrate our implementation, we will use a simple example-project.
It mimics a part of the registration process for a web application:
a (very!) simplistic service takes registration orders for new users.

  • Successful registration requests will return a 201 (Created) that carries the URI under which the data of the newly registered user can be accessed in the Location-header:


    echo peter | http :8080/users
    
    HTTP/1.1 201 
    Content-Length: 0
    Date: Fri, 05 Feb 2021 14:44:51 GMT
    Location: http://localhost:8080/users/peter
    

  • Requests to register an already existing user will result in a 400 (Bad Request):

    echo peter | http :8080/users
    
    HTTP/1.1 400 
    Connection: close
    Content-Length: 0
    Date: Fri, 05 Feb 2021 14:44:53 GMT
    

  • Successfully registered users can be listed:

    http :8080/users
    
    HTTP/1.1 200 
    Content-Type: application/json;charset=UTF-8
    Date: Fri, 05 Feb 2021 14:53:59 GMT
    Transfer-Encoding: chunked
    
    [
        {
            "created": "2021-02-05T10:38:32.301",
            "loggedIn": false,
            "username": "peter"
        },
        ...
    ]
    

The Messaging Use-Case

As our messaging use-case, imagine that several processes have to happen after the successful registration of a new user.
This may be the generation of an invoice, some business analytics or any other lengthy process that is best carried out asynchronously.
Hence, we have to generate an event that informs the responsible services about new registrations.

Obviously, these events should only be generated if the registration is completed successfully.
The event must not be fired if the registration is rejected because of a duplicate username.

On the other hand, the publication of the event must happen reliably, because otherwise the new user might not be charged for the services we offer…

The Transaction

The users are stored in a database and the creation of a new user happens in a transaction.
A “brilliant” colleague came up with the idea to trigger an IncorrectResultSizeDataAccessException to detect duplicate usernames:

User user = new User(username);
repository.save(user);
// Triggers an Exception, if more than one entry is found
repository.findByUsername(username);

The query for the user by its name triggers an IncorrectResultSizeDataAccessException if more than one entry is found.
The uncaught exception will mark the transaction for rollback, hence canceling the requested registration.
The 400-response is then generated by a corresponding ExceptionHandler:

@ExceptionHandler
public ResponseEntity incorrectResultSizeDataAccessException(
    IncorrectResultSizeDataAccessException e)
{
  LOG.info("User already exists!");
  return ResponseEntity.badRequest().build();
}

Please do not code this at home…

But this weird implementation perfectly illustrates the requirements for our messaging use-case:
The user is written into the database.
But the registration is not successfully completed until the transaction is committed.
If the transaction is rolled back, no message must be sent, because no new user was registered.

Decoupling with Spring’s EventPublisher

In the example implementation I am using an EventPublisher to decouple the business logic from the implementation of the messaging.
The controller publishes an event, when a new user is registered:

publisher.publishEvent(new UserEvent(this, username));

A listener annotated with @TransactionalEventListener receives the events and handles the messaging:

@TransactionalEventListener
public void onUserEvent(UserEvent event)
{
    // Sending the message happens here...
}

In non-critical use-cases, it might be sufficient to actually send the message to Kafka right here.
Spring ensures that the listener is only called if the transaction completes successfully.
But in the case of a failure, this naive implementation can lose messages.
If the application crashes after the transaction has completed but before the message could be sent, the event would be lost.
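
To make the risk tangible, here is a naive sketch of such a listener that sends directly to Kafka (the KafkaTemplate, the topic name user-events and the getUsername()-accessor are assumptions made for this illustration):

@TransactionalEventListener
public void onUserEvent(UserEvent event)
{
  // If the application crashes right here - after the commit,
  // but before the send has gone through - the event is lost
  kafkaTemplate.send("user-events", event.getUsername(), "CREATED");
}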

In the following blog posts, we will step by step implement a solution based on the Outbox-Pattern that can guarantee Exactly-Once semantics for the sent messages.

May The Source Be With You!

The complete source code of the example-project can be cloned here:

It includes a setup for Docker Compose that can be run without compiling
the project, and a runnable README.sh that compiles and runs the application and illustrates the example.

How To Instantiate Multiple Beans Dynamically in Spring-Boot Depending on Configuration-Properties

TL;DR

In this mini-HowTo I will show a way to instantiate multiple beans dynamically in Spring-Boot, depending on configuration-properties.
We will:

  • write an ApplicationContextInitializer to add the beans to the context before it is refreshed
  • write an EnvironmentPostProcessor to access the configured property sources
  • register the EnvironmentPostProcessor with Spring-Boot

Write an ApplicationContextInitializer

Additional beans can be added programmatically very easily with the help of an ApplicationContextInitializer:

@AllArgsConstructor
public class MultipleBeansApplicationContextInitializer
    implements
      ApplicationContextInitializer<ConfigurableApplicationContext>
{
  private final String[] sites;

  @Override
  public void initialize(ConfigurableApplicationContext context)
  {
    ConfigurableListableBeanFactory factory =
        context.getBeanFactory();
    for (String site : sites)
    {
      SiteController controller =
          new SiteController(site, "Description of site " + site);
      factory.registerSingleton("/" + site, controller);
    }
  }
}

This simplified example is configured with a list of strings that should be registered as controllers with the DispatcherServlet.
All “sites” are instances of the same controller class SiteController, which are instantiated and registered dynamically.

The instances are registered as beans with the method registerSingleton(String name, Object bean)
of a ConfigurableListableBeanFactory that can be accessed through the provided ConfigurableApplicationContext.

The array of strings represents the accessed configuration properties in the simplified example.
The array will most probably hold more complex data-structures in a real-world application.

But how do we get access to the configuration-parameters, that are injected in this array here…?

Accessing the Configured Property-Sources

Instantiating and registering the additional beans is easy.
The real problem is to access the configuration properties in the early plumbing-stage of the application-context in which our ApplicationContextInitializer runs:

The initializer cannot be instantiated and autowired by Spring!

The Bad News: In the early stage we are running in, we cannot use autowiring or access any of the other beans that will be instantiated by Spring – especially not the beans instantiated via @ConfigurationProperties, which we are interested in.

The Good News: We will present a way to access initialized instances of all property sources that will be available to your app.

Write an EnvironmentPostProcessor

If you write an EnvironmentPostProcessor, you will get access to an instance of ConfigurableEnvironment that contains a complete list of all PropertySources that are configured for your Spring-Boot-App.

public class MultipleBeansEnvironmentPostProcessor
    implements
      EnvironmentPostProcessor
{
  @Override
  public void postProcessEnvironment(
      ConfigurableEnvironment environment,
      SpringApplication application)
  {
    String sites =
        environment.getRequiredProperty("juplo.sites", String.class);

    application.addInitializers(
        new MultipleBeansApplicationContextInitializer(
            Arrays
                .stream(sites.split(","))
                .map(site -> site.trim())
                .toArray(size -> new String[size])));
  }
}
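
For illustration: the property that is read and split up above could be defined like this in application.properties (the values here are, of course, made up):

juplo.sites=foo,bar,baz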

The Bad News:
Unfortunately, you have to scan all property-sources for the parameters that you are interested in.
Also, all values are represented as strings in this early startup-phase of the application-context, because Spring’s convenient conversion mechanisms are not available yet.
So, you have to convert all values yourself and put them into more complex data-structures as needed.

The Good News:
The property names are consistently represented in standard Java-Properties-Notation, regardless of the actual type (.properties / .yml) of the property source.

Register the EnvironmentPostProcessor

Finally, you have to register the EnvironmentPostProcessor with your Spring-Boot-App.
This is done in the META-INF/spring.factories:

org.springframework.boot.env.EnvironmentPostProcessor=\
  de.juplo.demos.multiplebeans.MultipleBeansEnvironmentPostProcessor

That’s it, you’re done!

Source Code

You can find the whole source code in a working mini-application on juplo.de and GitHub:

Other Blog-Posts On The Topic

  • The blog-post Dynamic Beans in Spring shows a way to register beans dynamically, but does not show how to access the configuration. Also, another interface has since been added to Spring that facilitates this approach: BeanDefinitionRegistryPostProcessor
  • In How To Create Your Own Dynamic Bean Definitions In Spring, Benjamin shows how this interface can be applied and how one can access the configuration. But his example only works with plain Spring in a Servlet Container.

Deduplicating Partitioned Data With a Kafka Streams ValueTransformer

Inspired by a current customer project and this article about
deduplicating events with Kafka Streams,
I want to share a simple but powerful implementation of a deduplication mechanism that works well for partitioned data and does not suffer from memory leaks caused by storing a countless number of message keys.

Yet, the presented approach does not work for all use-cases, because it presumes that a strictly monotonically increasing sequence numbering can be established across all messages – at least for all messages that are routed to the same partition.

The Problem

A source produces messages with reliably unique IDs.
From time to time, sending these messages to Kafka may fail.
The order in which these messages are sent is crucial with respect to the incident they belong to.
Resending the messages in correct order after a failure (or downtime) is no problem.
But some of the messages may be sent twice (or more often), because the producer does not know exactly which messages were sent successfully.

Incident A - { id: 1,  data: "ab583cc8f8" }
Incident B - { id: 2,  data: "83ccc8f8f8" }
Incident C - { id: 3,  data: "115tab5b58" }
Incident C - { id: 4,  data: "83caac564b" }
Incident B - { id: 5,  data: "a583ccc8f8" }
Incident A - { id: 6,  data: "8f8bc8f890" }
Incident A - { id: 7,  data: "07583ab583" }

<< DOWNTIME OR FAILURE >>

Incident C - { id: 4,  data: "83caac564b" }
Incident B - { id: 5,  data: "a583ccc8f8" }
Incident A - { id: 6,  data: "8f8bc8f890" }
Incident A - { id: 7,  data: "07583ab583" }
Incident A - { id: 8,  data: "930fce58f3" }
Incident B - { id: 9,  data: "7583ab93ab" }
Incident C - { id: 10, data: "7583aab583" }
Incident B - { id: 11, data: "b583075830" }

Since each message has a unique ID, all messages are inherently idempotent:
Deduplication is no problem if the receiver keeps track of the messages it has already seen.

“Where is the problem?”, you may ask. “That’s trivial, I just code the deduplication into my consumer!”

But this approach has several drawbacks, including:

  • Implementing the trivial algorithm described above is not efficient, since the algorithm in general has to remember the IDs of all messages for an indefinite period of time.
  • Implementing the algorithm over and over again for every consumer is cumbersome and error-prone.

Wouldn’t it be much nicer if we had an efficient and bulletproof algorithm that we can simply plug into our Kafka-pipelines?

The Idea

In his blog-article
Jaroslaw Kijanowski describes three deduplication algorithms.
The first does not scale well, because it only works for single-partition topics.
The third aims at a slightly different problem and might fail to deduplicate some messages if the timing is not tuned correctly.
The second looks like a robust solution.
But it also looks a bit hacky and is unnecessarily complex in my opinion.

Playing around with his ideas, I have come up with the following algorithm, which combines elements of all three solutions:

  • All messages are keyed by an ID that represents the incident – not the message.
    This guarantees that all messages concerning a specific incident will be stored in the same partition, so that their ordering is retained.
  • We generate unique, strictly monotonically increasing sequence numbers that are assigned to each message.
    If the IDs of the messages fulfill these requirements and are stored in the value (like above), they can be reused as sequence numbers.
  • We keep track of the sequence number last seen for each partition.
  • We drop all messages with sequence numbers that are not greater than the last sequence number that we saw on that partition.

The algorithm uses the well-known approach that TCP/IP uses to detect and drop duplicate packets.
It is efficient, since we never have to store more sequence numbers than the number of partitions we are handling.
The algorithm can be implemented easily based on a ValueTransformer, because Kafka Streams provides the ability to store state locally.

A simplified example-implementation

To clarify the idea, I further simplified the problem for the example implementation:

  • Key and value of the messages are of type String, for easy scripting.
  • In the example implementation, person-names take the role of the incident-ID, which acts as the message-key.
  • The value of the message solely consists of the sequence number.

    In a real-world use-case, the sequence number would be stored in the message-value and would have to be extracted from there.
    Or it would be stored as a message-header.

That is, our message stream is simply a mapping from names to unique sequence numbers and we want to be able to separate out the contained sequence for a single person, without duplicate entries and without jeopardizing the order of that sequence.

In this simplified setup, the implementation effectively boils down to the following method-override:

@Override
public Iterable<String> transform(String value)
{
  Integer partition = context.partition();
  long sequenceNumber = Long.parseLong(value);

  Long seen = store.get(partition);
  if (seen == null || seen < sequenceNumber)
  {
    store.put(partition, sequenceNumber);
    return Arrays.asList(value);
  }

  return Collections.emptyList();
}

  • We can get the active partition from the ProcessorContext that is handed to our instance in the constructor, which is not shown here for brevity.
  • Parsing the String-value of the message as long corresponds to the extraction of the sequence number from the value of the message in our simplified setup.
  • We then check in the local state whether a sequence-number was already seen for the active partition.

    Kafka Streams takes care of the initialization and restoration of the local state.
    Take a look at the full source-code to see how we instruct Kafka Streams to do so.
  • If this is the first sequence-number that we see for this partition, or if the sequence-number is greater (that is: newer) than the stored one, we store it in our local state and return the value of the message, because it was seen for the first time.
  • Otherwise, we instruct Kafka Streams to drop the current (duplicate!) value by returning an empty list.

We can use our ValueTransformer with flatTransformValues()
to let Kafka Streams drop the detected duplicate values:

streamsBuilder
    .stream("input")
    .flatTransformValues(
        new ValueTransformerSupplier()
        {
          @Override
          public ValueTransformer get()
          {
            return new DeduplicationTransformer();
          }
        },
        "SequenceNumbers")
    .to("output");

One has to register an appropriate store with the StreamsBuilder under the referenced name.
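
How exactly the store is created depends on your setup; a sketch that would match the simplified example (the partition number as key, the last seen sequence number as value) could look like this:

streamsBuilder.addStateStore(
    Stores.keyValueStoreBuilder(
        Stores.persistentKeyValueStore("SequenceNumbers"),
        Serdes.Integer(),   // key: the partition number
        Serdes.Long()));    // value: the last sequence number seen on it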

The full source is available on github.com

Recapping Our Assumptions…

The presented deduplication algorithm relies on some assumptions that may not fit your use-case.
It is crucial that these prerequisites are not violated.
Therefore, I will spell them out once more:

  1. We can generate unique strictly monotonically increasing sequence numbers for all messages (of a partition).
  2. We have a strict ordering of all messages (per partition).
  3. And hence, since we want to handle more than one partition:
    The data is partitioned by key.
    That is, all messages for a specific key must always be routed to the same partition.

As a consequence of these assumptions, we have to note:
We can only deduplicate messages that are routed to the same partition.
This follows because we can only guarantee message-order per partition. But it should not be a problem, for the same reason:
We assume a use-case, where all messages concerning a specific incident are captured in the same partition.

What is not needed – but also does not hurt

Since we are only deduplicating messages that are routed to the same partition, we do not need globally unique sequence numbers.
Our sequence numbers only have to be unique per partition to enable us to detect that we have seen a specific message before on that partition.
Globally unique sequence numbers clearly are a stronger condition:
It does not hurt if the sequence numbers are globally unique, because they are then always unique per partition as well.

We detect unseen messages by the fact that their sequence number is greater than the last stored high watermark for the partition they are routed to.
Hence, we do not rely on a seamless numbering without gaps.
It does not hurt if the series of sequence numbers does not have any gaps, as long as two different messages on the same partition are never assigned the same sequence number.

That said, it should be clear that a globally unique, seamless numbering of all messages across all partitions – as in our simple example-implementation – fits well with our approach, because the numbering is still unique if one only considers the messages in one partition, and the gaps in the numbering that are introduced by focusing only on the messages of a single partition do not violate our assumptions.

Pointless / Contradictorily Usage Of The Presented Approach

Last but not least, I want to point out that this approach silently assumes that the sequence number of the message is not identical to the key of the message.
On the contrary: The sequence number is expected to be different from the key of the message!

If one used the key of the message as its sequence number (provided that it is unique and represents a strictly increasing sequence of numbers), one would indeed ensure that all duplicates can be detected, but would at the same time force the implementation to be indifferent concerning the order of the messages.

That is because subsequent messages are forced to have different keys, since all messages are required to have unique sequence numbers.
But messages with different keys may be routed to different partitions – and Kafka can only guarantee message ordering for messages that live on the same partition.
Hence, one has to assume that the order in which the messages are sent is not retained if the message-keys are used as sequence numbers – unless only one partition is utilized, which is contradictory to our primary goal here: enabling scalability through data-sharding.

This is also true if the key of a message contains an invariant ID and only embeds the changing sequence number.
This is because the default partitioning algorithm always considers the key as a whole, and if any part of it changes, the outcome of the algorithm might change.

In a production-ready implementation of the presented approach, I would advise storing the sequence number in a message header, or providing a configurable extractor that can derive the sequence number from the contents of the value of the message.
It would be perfectly OK if the IDs of the messages are used as sequence numbers, as long as they are unique and monotonically increasing and are stored in the value of the message – not in / as the key!

Testing Exception-Handling in Spring-MVC

Specifying Exception-Handlers for Controllers in Spring MVC

Spring offers the annotation @ExceptionHandler to handle exceptions thrown by controllers.
The annotation can be added to methods of a specific controller, or to methods of a @Component-class that is itself annotated with @ControllerAdvice.
The latter defines global exception-handling that will be carried out by the DispatcherServlet for all controllers.
The former specifies exception-handlers for a single controller-class.
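
For the global variant, such a handler could look like the following sketch (the class name is just a placeholder):

@ControllerAdvice
public class GlobalExceptionHandling
{
  @ExceptionHandler(IllegalArgumentException.class)
  public ResponseEntity<Void> illegalArgumentException(IllegalArgumentException e)
  {
    // Resolves the exception as 400 (Bad Request) for all controllers,
    // not just a single controller-class
    return ResponseEntity.badRequest().build();
  }
}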

This mechanism is documented in the Springframework Documentation and it is neatly summarized in the blog-article
Exception Handling in Spring MVC.
In this article, we will focus on testing the specified exception-handlers.

Testing Exception-Handlers with the @WebMvcTest-Slice

Spring-Boot offers the annotation @WebMvcTest for tests of the controller-layer of your application.
For a test annotated with @WebMvcTest, Spring-Boot will:

  • Auto-configure Spring MVC, Jackson, Gson, Message converters etc.
  • Load relevant components (@Controller, @RestController, @JsonComponent etc.)
  • Configure MockMVC

All other beans configured in the app will be ignored.
Hence, a @WebMvcTest fits perfectly for testing exception-handlers, which are part of the controller-layer.
It enables us to mock away the other layers of the application and concentrate on the part that we want to test.

Consider the following controller, which defines a request-handler and an accompanying exception-handler for an
IllegalArgumentException that may be thrown in the business-logic:

@Controller
public class ExampleController
{
  @Autowired
  ExampleService service;

  @RequestMapping("/")
  public String controller(
      @RequestParam(required = false) Integer answer,
      Model model)
  {
    Boolean outcome = answer == null ? null : service.checkAnswer(answer);
    model.addAttribute("answer", answer);
    model.addAttribute("outcome", outcome);
   return "view";
  }

  @ResponseStatus(HttpStatus.BAD_REQUEST)
  @ExceptionHandler(IllegalArgumentException.class)
  public ModelAndView illegalArgumentException(IllegalArgumentException e)
  {
    LOG.error("{}: {}", HttpStatus.BAD_REQUEST, e.getMessage());
    ModelAndView mav = new ModelAndView("400");
    mav.addObject("exception", e);
    return mav;
  }
}

The exception-handler resolves the exception as 400: Bad Request and renders the specialized error-view 400.

With the help of @WebMvcTest, we can easily mock away the actual implementation of the business-logic and concentrate on the code under test:
our specialized exception-handler.

@WebMvcTest(ExampleController.class)
class ExceptionHandlingApplicationTests
{
  @MockBean  ExampleService service;
  @Autowired MockMvc mvc;

  @Test
  void test400ForExceptionInBusinessLogic() throws Exception {
    when(service.checkAnswer(anyInt())).thenThrow(new IllegalArgumentException("FOO!"));

    mvc
      .perform(get(URI.create("http://FOO/?answer=1234")))
      .andExpect(status().isBadRequest());

    verify(service, times(1)).checkAnswer(anyInt());
  }
}

We perform a GET with the help of the provided MockMvc and check that the status of the response fulfills our expectations when we tell our mocked business-logic to throw the IllegalArgumentException that is resolved by our exception-handler.

How To Redirect To Spring Security OAuth2 Behind a Gateway/Proxy — Part 2: Hiding The App Behind A Reverse-Proxy (Aka Gateway)

This post is part of a series of Mini-Howtos that gather some help to get you started when switching from localhost to production, with SSL and a reverse-proxy (aka gateway) in front of your app that forwards the requests to your app, which listens on a different name/IP, port and protocol.

In This Series We…

  1. Run the official Spring-Boot-OAuth2-Tutorial as a container in docker
  2. Simulate production by hiding the app behind a gateway (this part)
  3. Show how to debug the oauth2-flow for the whole crap!
  4. Enable SSL on our gateway
  5. Show how to do the same with Facebook, instead of GitHub

I will also give some advice for those of you who are new to Docker – but just enough to enable you to follow.

This is part 2 of this series, which shows how to run a Spring-Boot OAuth2 App behind a gateway.
Part 1 is linked above.

Our Plan: Simulating A Production-Setup

We will simulate a production-setup by adding the domain that will be used in production – example.com in our case – as an alias for localhost.

Additionally, we will start an NGINX as reverse-proxy alongside our app and put both containers into a virtual network.
This simulates a real-world scenario, where your app will be running behind a gateway together with a bunch of other apps and will have to deal with forwarded requests.

Together, this enables you to test the production-setup of your oauth2-provider against a locally running development environment, including the configuration of the final URIs and nasty forwarding-errors.

To reach this goal we will have to:

  1. Reconfigure our oauth-provider for the new domain
  2. Add the domain as an alias for localhost
  3. Create a virtual network
  4. Move the app into the created virtual network
  5. Configure and start nginx as gateway in the virtual network

By the way:
Any other server that can act as a reverse proxy, or some real gateway like Zuul, would work as well, but we stick with good old NGINX to keep it simple.

Switching The Setup Of Your OAuth2-Provider To Production

In our example we are using GitHub as oauth2-provider and example.com as the domain where the app should be found after the release.
So, we will have to change the Authorization callback URL to
http://example.com/login/oauth2/code/github

O.k., that’s done.

But we haven’t released yet, and nothing can be found on the real server that hosts example.com.
But still, we really would like to test that production-setup to be sure that we configured all bits and pieces correctly!


In order to tackle this chicken-and-egg problem, we will fool our locally running browser into believing that example.com is our local development system.

Setting Up The Alias for example.com

On Linux/Unix this can be simply done by editing /etc/hosts.
You just have to add the domain (example.com) at the end of the line that starts with 127.0.0.1:

127.0.0.1	localhost example.com

Locally running programs – like your browser – will now resolve example.com as 127.0.0.1.
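
If you want to double-check the alias, you can resolve the name on the command line (on Linux), for example:

getent hosts example.com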

Create A Virtual Network With Docker

Next, we have to create a virtual network, where we can put in both containers:

docker network create juplo

Yes, with Docker it is as simple as that.

Docker networks also come with some extra goodies.
One that is extremely handy for our use-case: they enable automatic name-resolving for the connected containers.
Because of that, we do not need to know the IP-addresses of the participating containers if we give each connected container a name.

Docker vs. Kubernetes vs. Docker-Compose

We are using Docker here on purpose.
Using Kubernetes just to test / experiment on a DevOp-box would be overkill.
Using Docker-Compose might be an option.
But we want to keep it as simple as possible for now, hence we stick with Docker.
Also, we are just experimenting here.


You might want to switch to Docker-Compose later.
Especially if you plan to set up an environment that you will frequently reuse for manual tests or such.

Move The App Into The Virtual Network

To move our app into the virtual network, we have to start it again with the additional parameter --network.
We also want to give it a name this time, by using --name, to be able to contact it by name.


You have to stop and remove the old container from part 1 of this HowTo-series with CTRL-C beforehand, if it is still running – removing is done automatically, because we specified --rm:

docker run \
  -d \
  --name app \
  --rm \
  --network juplo \
  juplo/social-logout:0.0.1 \
  --server.use-forward-headers=true \
  --spring.security.oauth2.client.registration.github.client-id=YOUR_GITHUB_ID \
  --spring.security.oauth2.client.registration.github.client-secret=YOUR_GITHUB_SECRET

Summary of the changes in comparison to the statement used in part 1:

  • We added -d to run the container in the background – See tips below…
  • We added --server.use-forward-headers=true, which is needed because our app is running behind a gateway now – I will explain this in more detail later

  • And: Do not forget the --network juplo,
    which is necessary to put the app in our virtual network juplo, and --name app, which is necessary to enable DNS-resolving.
  • You do not need the port-mapping this time, because we will only talk to our app through the gateway.
    Remember: We are hiding our app behind the gateway!

Some quick tips to Docker-newbies

  • Since we are starting multiple containers that shall run in parallel, you have to start each command in a separate terminal, because CTRL-C will stop (and in our case remove) the container again.
  • Alternatively, you can add the parameter -d (for daemonize) to start the container in the background.

  • Then, you can look at its output with docker logs -f NAME (safely interruptible with CTRL-C) and stop (and in our case remove) the container with docker stop NAME.
  • If you wonder, which containers are actually running, docker ps is your friend.

Starting the Reverse-Proxy Aka Gateway

Next, we will start NGINX alongside our app and configure it as reverse-proxy:

  1. Create a file proxy.conf with the following content:

    upstream upstream_a {
      server        app:8080;
    }
    
    server {
      listen        80;
      server_name   example.com;
    
      proxy_set_header     X-Real-IP           $remote_addr;
      proxy_set_header     X-Forwarded-For     $proxy_add_x_forwarded_for;
      proxy_set_header     X-Forwarded-Proto   $scheme;
      proxy_set_header     Host                $host;
      proxy_set_header     X-Forwarded-Host    $host;
      proxy_set_header     X-Forwarded-Port    $server_port;
    
      location / {
        proxy_pass  http://upstream_a;
      }
    }
    
    • We define a server that listens to requests for the host example.com (server_name) on port 80.
    • With the location-directive we tell this server that all requests shall be handled by the upstream-server upstream_a.
    • This server was defined in the upstream-block at the beginning of the configuration-file to forward to app:8080.
    • app is simply the name of the container that is running our oauth2-app – remember: the name is resolvable via DNS.
    • 8080 is the port our app listens on in that container.
    • The proxy_set_header-directives are needed by Spring-Boot Security for dealing correctly with the circumstance that it is running behind a reverse-proxy.

    In part 3, we will survey the proxy_set_header-directives in more detail.

  2. Start nginx in the virtual network and connect port 80 to localhost:

    docker run \
      --name proxy \
      --rm \
      --network juplo -p 80:80 \
      --volume $(pwd)/proxy.conf:/etc/nginx/conf.d/proxy.conf:ro \
      nginx:1.17
    

    This command has to be executed in the directory where you have created the file proxy.conf.

    • I use NGINX here because I want to demystify the work of a gateway.
      traefik would have been easier to configure in this setup, but it would have disguised what is going on behind the scenes: with NGINX we have to configure everything manually, which is more explicit and hence more informative.
    • We can use port 80 on localhost, since the docker-daemon runs with root-privileges and hence can use this privileged port – provided you do not have another webserver running locally on that port.
    • $(pwd) resolves to your current working-directory – this is the most convenient way to produce the absolute path to proxy.conf, which is required for --volume to work correctly.
  3. If you have reproduced the recipe exactly, your app should be up and running now.
    That is:

    • Because we set the alias example.com to point at localhost, you should now be able to open your app as http://example.com in a locally running browser.
    • You should then be able to log in and log out without errors.
    • If you have configured everything correctly, neither your app nor GitHub should complain during the redirect to GitHub and back to your app.

    What’s next… is what can go wrong!

    In this simulated production-setup a lot of stuff can go wrong!
    You may face nearly any problem, from configuration-mismatches concerning the redirect-URIs to nasty and hidden redirect-issues due to forwarded requests.


    Do not mutter at me…
    Remember: that was the reason we set up this simulated production-setup in the first place!

    In the next part of this series I will explain some of the most common problems in a production-setup with forwarded requests.
    I will also show how you can debug the oauth2-flow in your simulated production-setup to discover and solve these problems.

How To Redirect To Spring Security OAuth2 Behind a Gateway/Proxy – Part 1: Running Your App In Docker

Switching From Tutorial-Mode (aka POC) To Production Is Hard

Developing your first OAuth2-App on localhost with OAuth2 Boot may be easy, …

…but what about running it in real life?

Looking for the real life

This is the first post of a series of Mini-Howtos that gather some help to get you started when switching from localhost to production, with SSL and a reverse-proxy (aka gateway) in front of your app that forwards the requests to your app, which listens on a different name/IP, port and protocol.

In This Series We Will…

  1. Start with the fantastic official OAuth2-Tutorial from the Spring-Boot folks – love it! – and run it as a container in docker
  2. Hide that behind a reverse-proxy, like in production – nginx in our case, but it could be any piece of software that can act as a gateway
  3. Show how to debug the oauth2-flow for the whole crap!
  4. Enable SSL for our gateway – because oauth2-providers (like Facebook) are pressing us to do so
  5. Show how to do the same with Facebook, instead of GitHub

I will also give some advice for those of you who are new to Docker – but just enough to enable you to follow.

This is Part 1 of this series, which shows how to package a Spring-Boot-App as a Docker-Image and run it as a container.

tut-spring-boot-oauth2/logout

As an example of a simple app that uses OAuth2 for authentication, we will use the third step of the Spring-Boot OAuth2-Tutorial.

You should work through that tutorial up to that step – called logout – if you have not done so yet.
This will guide you through programming and setting up a simple app that uses the GitHub-API to authenticate its users.

In particular, it explains how to create and set up an OAuth2-App on GitHub. Do not miss out on that part: you need your own app-ID and -secret and a correctly configured redirect URI.

You should be able to build the app as a JAR and start it with the ID/secret of your GitHub-App, without changing code or configuration-files, as follows:

mvn package
java -jar target/social-logout-0.0.1-SNAPSHOT.jar \
  --spring.security.oauth2.client.registration.github.client-id=YOUR_GITHUB_APP_ID \
  --spring.security.oauth2.client.registration.github.client-secret=YOUR_GITHUB_APP_SECRET

If the app is running correctly, you should be able to log in and log out via http://localhost:8080/.

The folks at Spring-Boot are keeping the guide and this repository up-to-date pretty well.
At the time of writing this article, it is up to date with version 2.2.2.RELEASE of Spring-Boot.

You may as well use any other OAuth2-application here, for example your own POC, if you have already built one that works while running on localhost.

Some Short Notes On OAuth2

I will only explain the protocol very briefly here, so that you can understand what goes wrong in case you stumble across one of the many pitfalls when setting up oauth2.
You can read more about oauth2 elsewhere.

For authentication, oauth2 redirects the browser of your user to a server of your oauth2-provider.
This server authenticates the user and redirects the browser back to your server, providing additional information and resources that let your server know that the user was authenticated successfully and enable it to request more information in the name of the user.

Hence, when configuring oauth2 one has to:

  1. Provide the URI of the server of your oauth2-provider that the browser will be redirected to for authentication
  2. Tell the server of the oauth2-provider the URL the browser will be redirected back to after authentication
  3. Also, your app has to provide some identification – a client-ID and -secret – when redirecting to the server of your oauth2-provider, which it therefore has to know

There are a lot more things that can be configured in oauth2, because the protocol is designed to fit a wide range of use-cases.
But in our case, it usually boils down to the parameters mentioned above.

Considering our combination of spring-security-oauth2 with GitHub this means:

  1. The redirect-URIs of well-known oauth2-providers like GitHub are built into the library and do not have to be configured explicitly.
  2. The URI the provider has to redirect the browser back to after authenticating the user is predefined by the library as well.

    But as an additional security measure, almost every oauth2-provider requires you to also specify this redirect-URI in the configuration on the side of the oauth2-provider.

    This is a good and necessary protection against fraud, but at the same time the primary source of misconfiguration:
    If the URIs specified in the configuration of your app and on the server of your oauth2-provider do not match, ALL WILL FAIL!
  3. The ID and secret of the client (your GitHub-app) always have to be specified explicitly by hand.

Again, everything can be manually overridden, if needed.
Configuration-keys starting with spring.security.oauth2.client.registration.github choose GitHub as the oauth2-provider and trigger a bunch of predefined default-configuration.
If you have set up your own oauth2-provider, you have to configure everything manually.
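
For a self-hosted oauth2-provider this means spelling out the endpoints under spring.security.oauth2.client.provider.* yourself. A sketch of the relevant keys (the provider-id myprovider and all URIs are placeholders, not part of the example project):

spring.security.oauth2.client.registration.myprovider.client-id=YOUR_CLIENT_ID
spring.security.oauth2.client.registration.myprovider.client-secret=YOUR_CLIENT_SECRET
spring.security.oauth2.client.provider.myprovider.authorization-uri=https://provider.example/oauth/authorize
spring.security.oauth2.client.provider.myprovider.token-uri=https://provider.example/oauth/token
spring.security.oauth2.client.provider.myprovider.user-info-uri=https://provider.example/userinfo
spring.security.oauth2.client.provider.myprovider.user-name-attribute=id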

Running The App Inside Docker

To facilitate the debugging – and because this most probably will be the way you are deploying your app anyway – we will start by building a docker-image from the app.

For this, you do not have to change a single character in the example project – all adjustments to the configuration will be done when the image is started as a container.
Just change to the subdirectory logout of the checked out project and create the following Dockerfile there:

FROM openjdk:8-jre-buster

COPY  target/social-logout-0.0.1-SNAPSHOT.jar /opt/app.jar
EXPOSE 8080
ENTRYPOINT [ "/usr/local/openjdk-8/bin/java", "-jar", "/opt/app.jar" ]
CMD []

This defines a docker-image, that will run the app.

  • The image derives from openjdk:8-jre-buster, which is an installation of the latest OpenJDK 8 JRE on Debian Buster
  • The app will listen on port 8080
  • By default, a container instantiated from this image will automatically start the Java-app
  • The CMD [] overrides the default from the parent-image with an empty list – this enables us to pass command-line parameters to our Spring-Boot app, which we will need to pass in our configuration

You can build and tag this image with the following commands:

mvn clean package
docker build -t juplo/social-logout:0.0.1 .

This will tag your image as juplo/social-logout:0.0.1 – you obviously will/should use your own tag here, for example: myfancytag

Do not miss out on the flyspeck (.) at the end of the last line!

You can run this new image with the following command – and you should do that to test that everything works as expected:

docker run \
  --rm \
  -p 8080:8080 \
  juplo/social-logout:0.0.1 \
  --spring.security.oauth2.client.registration.github.client-id=YOUR_GITHUB_ID \
  --spring.security.oauth2.client.registration.github.client-secret=YOUR_GITHUB_SECRET
  • --rm removes this test-container automatically, once it is stopped again
  • -p 8080:8080 redirects port 8080 on localhost to the app

Everything after the specification of the image (here: juplo/social-logout:0.0.1) is handed as a command-line parameter to the started Spring-Boot app – that is why we needed to declare CMD [] in our Dockerfile.

We utilize this here to pass the ID and secret of your GitHub-app into the docker container — just like when we started the JAR directly

The app should now behave exactly like in the test above, where we started it directly by calling the JAR.

That means that you should still be able to log into and out of your app if you browse to http://localhost:8080.
At least, if you have correctly configured http://localhost:8080/login/oauth2/code/github as the authorization callback URL in the settings of your OAuth App on GitHub.

Coming Next…

In the next part of this series, we will hide the app behind a proxy and simulate that the setup is running on our real server example.com.

Actuator HTTP Trace Does Not Work With Spring Boot 2.2.x

TL;DR

In Spring Boot 2.2.x, you have to instantiate a @Bean of type InMemoryHttpTraceRepository to enable the HTTP Trace Actuator.

Jump to the explanation of and example code for the fix

Enabling HTTP Trace — Before 2.2.x...

Spring Boot comes with a very handy feature called Actuator.
Actuator provides a built-in, production-ready REST-API that can be used to monitor / manage / debug your bootified App.
To enable it — prior to 2.2.x — one only had to:

  1. Specify the dependency for Spring Boot Actuator:

    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    
    
  2. Expose the needed endpoints via HTTP:

    management.endpoints.web.exposure.include=*
    
    • This exposes all available endpoints via HTTP.
    • Advice: Do not copy this into a production config
      (Without thinking about it twice and — at least — enable some security measures to protect the exposed endpoints!)

The problem: It simply does not work any more in 2.2 :(

But…

  • If you upgrade your existing app with a working httptrace-actuator to Spring Boot 2.2.x, or
  • If you start with a fresh app in Spring Boot 2.2.x and try to enable the httptrace-actuator as described in the documentation

…it simply does not work at all!

The Fix

The simple fix for this problem is to add a @Bean of type InMemoryHttpTraceRepository to your @Configuration-class:

@Bean
public HttpTraceRepository httpTraceRepository()
{
  return new InMemoryHttpTraceRepository();
}
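
With this bean in place (and the endpoints exposed via HTTP as shown above), the recorded traces should be retrievable again, for example with HTTPie:

http :8080/actuator/httptrace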

The Explanation

The cause of this problem is not a bug, but a legitimate change in the default configuration.
Unfortunately, this change is not noted in the corresponding section of the documentation.
Instead, it is buried in the Upgrade Notes for Spring Boot 2.2.

The default-implementation stores the captured data in memory.
Hence, it consumes a lot of memory, without the user necessarily knowing or, even worse, needing it.
This is especially undesirable in cluster environments, where memory is a precious resource.
And remember: Spring Boot was invented to simplify cluster deployments!

That is why this feature is now turned off by default and has to be turned on by the user explicitly, if needed.

Fix Hot Reload of Thymeleaf-Templates In spring-boot:run

The Problem: Hot-Reload Of Thymeleaf-Templates Does Not Work When The Application Is Run With spring-boot:run

A lot of people seem to have problems with hot reloading of static HTML-resources when developing a Spring-Boot application that uses Thymeleaf as templating engine with spring-boot:run.
There are a lot of tips out there on how to fix that problem:

  • The official Hot-Swapping-Guide says that you just have to add spring.thymeleaf.cache=false to your application-configuration in src/main/resources/application.properties.
  • Some say that you have to disable caching by setting spring.template.cache=false and spring.thymeleaf.cache=false and/or run the application in debugging mode.
  • Others say that you have to add a dependency on org.springframework:springloaded to the configuration of the spring-boot-maven-plugin.
  • There is even a bug-report on GitHub that says that you have to run the application from your favorite IDE.

But none of these fixes worked for me.
Some might work if I switched my IDE (I am using NetBeans), but I have not tested that, because I am not willing to switch my beloved IDE because of this issue.

The Solution: Move Your Thymeleaf-Templates Back To src/main/webapp

Fortunately, I found a simple solution to fix the issue without all the above stuff.
You simply have to move your Thymeleaf-Templates back to where they belong (IMHO) – src/main/webapp – and turn off the caching.
It is not necessary to run the application in debugging mode and/or from your IDE, nor is it necessary to add the dependency on springloaded or more configuration-switches.

To move the templates and disable caching, just add the following to your application configuration in src/main/resources/application.properties:

spring.thymeleaf.prefix=/thymeleaf/
spring.thymeleaf.cache=false

Of course, you also have to move your Thymeleaf-Templates from src/main/resources/templates/ to src/main/webapp/thymeleaf/.
In my opinion, the templates belong there anyway, in order to have them accessible as normal static HTML(5)-files.
If they are locked away in the classpath, you cannot access them, which foils the approach of Thymeleaf that you can view your templates in a browser as they are.

Funded by the European Union

This article was published in the course of a
research project
that is funded by the European Union and the federal state of North Rhine-Westphalia.


European Union: Investing in our future - European Regional Development Fund
EFRE.NRW 2014-2020: Investments in growth and employment