CS636  Class 22 Servlets, Multithreading, Debugging

Last time: how the pizza3 servlets work.

We looked at StudentBean: how it is created for a user in StudentWelcomeController and made into a session variable, and how the DispatcherServlet makes sure that an incoming request is forwarded to the StudentWelcomeController if the session variable is not there.

Threads and Concurrency

Consider request-response cycles for various users:

We are assuming that each user causes a sequence of request cycles, usually well separated in time. Concurrent requests come about because different users’ requests happen to occur at the same time, as follows:


Timeline showing requests for one user over time (figure not reproduced; one line segment in it is mistaken).

Session object: continuing existence of the session object, until 20 minutes of idleness by the user.

Another user, another such sequence of requests, and another session object...

Rough Performance Analysis




Suppose 10000 session objects, for 10000 users in last 20 minutes

1000 active sessions, each lasting 2 minutes (so 10000 in 20 min)

Each user session does 10 requests over the 2 minutes, so 10000 requests in 2 min = 120 sec

So about 100 requests per sec (10000/120 is about 83; we round up to 100 to keep the arithmetic easy).

Suppose each request takes 50 ms and uses 10% of a CPU during this time

That’s 5ms of CPU time per request, so 100 req/sec * 5ms = 500 ms CPU per sec, means 50% of one CPU to handle this load, OK.

We said 50ms elapsed time per request.  There are 20 such periods in each second, in which 100 requests are positioned, so about 5 requests in each 50ms period.

So we see about 5 concurrent requests occurring at each point in time with this load. Most requests have transactions, so also about 5 concurrent transactions, each with its em.
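The arithmetic above can be checked with a small sketch. The class and method names here are made up for illustration; the numbers are the ones used in these notes. The concurrency estimate is just rate times elapsed time per request (Little's law).

```java
// Back-of-envelope load estimate using the figures from the notes above.
// LoadEstimate and its method names are hypothetical, for illustration only.
class LoadEstimate {
    // Requests per second: total requests in the window divided by the window length.
    static double requestsPerSecond(int activeSessions, int requestsPerSession, double windowSeconds) {
        return activeSessions * requestsPerSession / windowSeconds;
    }

    // Fraction of one CPU consumed: rate * elapsed seconds per request * CPU share while running.
    static double cpuUtilization(double reqPerSec, double elapsedSec, double cpuFraction) {
        return reqPerSec * elapsedSec * cpuFraction;
    }

    // Average number of requests in flight: rate * elapsed seconds per request.
    static double concurrentRequests(double reqPerSec, double elapsedSec) {
        return reqPerSec * elapsedSec;
    }

    public static void main(String[] args) {
        double exact = requestsPerSecond(1000, 10, 120.0);
        double rounded = 100.0;   // the rounded rate used in the notes
        System.out.printf("exact rate: %.1f req/sec%n", exact);
        System.out.printf("CPU share at 100 req/sec: %.2f of one CPU%n",
                cpuUtilization(rounded, 0.050, 0.10));
        System.out.printf("concurrent requests at 100 req/sec: %.1f%n",
                concurrentRequests(rounded, 0.050));
    }
}
```

With the unrounded rate of about 83 requests per sec, the same formulas give a bit over 4 concurrent requests, so "about 5" is a slightly generous estimate.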

That means we are involved in multithreaded execution. But all the changeable shared domain data is held in the database, so we are using the database to do the hard work of concurrency control to shared data.  That’s one of the main secrets of good web application design.

Scaling up

First use a decent multiprocessor for the app server, and another one for the DB server.

Will find that the app server gets overloaded before the DB server (assuming small DB requests of course).

So scale up the app server first.  Eventually one system can't handle the load.

So replicate the app servers up to 25 ways (requires a request router up front). Get a 40-way (or more) multiprocessor for the DB, with lots of memory. Note that a DB caches hot data in its database cache, for fast access, so get enough memory for all important database data.

This config is fully supported by our software setup, because of the independence of all the request environments.

Only when the biggest system can’t run the needed DB do you replicate the DB, unless the work can be perfectly partitioned in a way the request router can understand.

Luckily, when your site is this big, you should have enough money to hire real experts!

Multi-threading: Let the DB handle concurrent access to domain data                                                  

Five concurrent transactions, each with its own thread, means we’re definitely doing a multi-threaded application.  Need to worry about concurrent access to shared data.

But with our architecture, the database system takes care of concurrent access to shared data.  The only memory objects that receive concurrent access in our app are the API objects, the servlet objects and their Controller objects, none of which change after setup, and the emf, which is thread-safe (it has to be!). We are assuming that a single user does not have concurrent requests, so we don’t have to worry about synchronizing access to the HttpSession object.
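Why is a shared object that doesn't change safe? A minimal sketch, with made-up stand-in classes (DemoDao, StatelessService, SharedServiceDemo are hypothetical, not pizza3 classes): the shared object's only field is a final reference, and all working data lives in parameters and local variables, which belong to the calling thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo: one shared, unchanging object called from several threads.
class SharedServiceDemo {
    // Calls the single shared service from nThreads threads and collects the results in order.
    static List<String> callConcurrently(StatelessService service, int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < nThreads; i++) {
                final int id = i;
                Callable<String> task = () -> service.lookup(id);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        StatelessService service = new StatelessService(new DemoDao());
        System.out.println(callConcurrently(service, 5));
    }
}

// Hypothetical stand-in for a DAO; it has no mutable fields of its own.
class DemoDao {
    String describe(int id) {
        return "topping-" + id;
    }
}

// Hypothetical stateless service: its only field is a final DAO reference,
// so nothing in it changes after construction.
class StatelessService {
    private final DemoDao dao;

    StatelessService(DemoDao dao) {
        this.dao = dao;
    }

    // Parameters and locals are per-thread, so concurrent calls do not interfere.
    String lookup(int id) {
        return dao.describe(id).toUpperCase();
    }
}
```

If StatelessService kept a mutable field (say, a "last id looked up"), two requests could overwrite each other's value, and we would be back to needing synchronization.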

A short em lifetime is good for performance: be wary of “stateful session beans”

There are ways to keep the em going for a user between transactions, using an “extended persistence context” and EJB “stateful session beans”. But note that would mean 10000 em’s above, a much larger memory use.  If an em needs 1MB (a guess), 10000 of them would be 10 GB, a big load for a single system.  But our setup only needs 5 of them, one for each active transaction, a total of 5MB, no problem.

Web app initialization

How does a web app start up and create the needed service and DAO singletons in the web app case?

We know the servlets have an init() method that tomcat calls once when the servlet starts. Good place to call configureServices to get the system up and running.  But there are two important servlets--do they both call this? Better to ensure that only one does.

We could have the various servlets check if the service refs are available and if not, call configureServices to set them up.

Well, that would work (with a possible race condition causing duplicate “singletons”), but actually AdminServlet is configured in web.xml to load after DispatcherServlet. See web.xml:

<servlet>
    <description>Admin Servlet</description>
    <display-name>AdminServlet</display-name>
    <servlet-name>AdminServlet</servlet-name>
    <servlet-class>cs636.pizza.presentation.web.AdminServlet</servlet-class>
    <!-- make this servlet load after the dispatcher servlet -->
    <load-on-startup>2</load-on-startup>
</servlet>

Similarly DispatcherServlet has   <load-on-startup>1</load-on-startup>, so tomcat carefully calls init of DispatcherServlet first, and after that returns, calls init() of AdminServlet.

This sequence builds the service and DAO singleton objects with inter-references like this, as seen before in pizza1 and pizza2:

[Figure: Service and DAO singletons in pizza1 through pizza3. For pizza3, they are inside the tomcat JVM.]

This object graph lives on between user sessions and requests.  The singletons are themselves protected from garbage collection because of the static fields in PizzaSystemConfig.

We added the servlet objects and their Controllers to this picture. The servlet objects are not garbage collected because they are managed by tomcat itself, so it has references to them.
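The check-then-configure alternative mentioned earlier in this section has a race because "check if null" and "set it up" are two separate steps. A sketch of that idea with the two steps made atomic is below. This is not how pizza3 does it (pizza3 relies on load-on-startup ordering); ConfigOnceDemo is a made-up class, and a plain Object stands in for whatever configureServices would build.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of check-then-configure done under a lock.
class ConfigOnceDemo {
    // Counts how many times the setup step actually ran.
    static final AtomicInteger setupRuns = new AtomicInteger();

    // Stand-in for the service reference that setup creates.
    private static Object service;

    // synchronized makes the null check and the setup one atomic step,
    // so only the first caller performs the setup.
    static synchronized Object getService() {
        if (service == null) {
            setupRuns.incrementAndGet();
            service = new Object();   // stand-in for the real configuration call
        }
        return service;
    }

    // Releases several threads at once to ask for the service;
    // returns the number of setups performed so far.
    static int raceForService(int nThreads) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                getService();
            });
            threads[i].start();
        }
        start.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return setupRuns.get();
    }

    public static void main(String[] args) {
        System.out.println("setups performed: " + raceForService(8));
    }
}
```

Without the synchronized keyword, two threads could both see null and both run the setup, giving the duplicate "singletons" mentioned above.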

More on JPA handling in request

As covered last time, each request comes in and creates its own EM, held in ThreadLocal storage for that request only. 

The EM sets up a persistence context for the request, which houses the domain objects needed for this request.

Recall we called the domain objects “scratch copies” of database data.

They are also request-private, unless we specially set up sharing for the POJO by more annotations in its source (an advanced topic, not covered in this course). There are no such annotations in pizza3, so PizzaOrder, Topping, and PizzaSize objects are all request-private copies.

Each request gets fresh copies of the DB data, so if an admin deletes a topping, that will be evident in following requests.

These private domain objects are important to the argument that we don’t have to worry about multithreading issues in our code. That’s a big feature!
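The "one EM per request, held in ThreadLocal storage" idea can be sketched with the standard ThreadLocal class alone. FakeEm below is a made-up stand-in for a real EntityManager, and PerRequestEmDemo is a hypothetical class, not pizza3 code.

```java
// Hypothetical demo: each simulated request keeps its own "EM" in a ThreadLocal.
class PerRequestEmDemo {
    // Stand-in for an EntityManager; each instance is one request's private context.
    static class FakeEm { }

    // One slot per thread: get() returns only what the calling thread itself set.
    private static final ThreadLocal<FakeEm> currentEm = new ThreadLocal<>();

    // Simulates one request: create an EM, use it, and always clear it at the end.
    static FakeEm handleRequest() {
        FakeEm em = new FakeEm();
        currentEm.set(em);
        try {
            return currentEm.get();   // code deeper in the call chain fetches it like this
        } finally {
            currentEm.remove();       // don't leave it behind for the thread's next request
        }
    }

    // Runs two simulated requests on separate threads. Returns true if each saw its own EM
    // and the calling thread, which never set one, sees nothing.
    static boolean emIsPrivateToThread() {
        FakeEm[] seen = new FakeEm[2];
        Thread t0 = new Thread(() -> seen[0] = handleRequest());
        Thread t1 = new Thread(() -> seen[1] = handleRequest());
        t0.start();
        t1.start();
        try {
            t0.join();
            t1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0] != null && seen[1] != null && seen[0] != seen[1] && currentEm.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("EM private to each thread: " + emIsPrivateToThread());
    }
}
```

The remove() in the finally block matters in a server, where threads are typically pooled and reused for later requests.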

Multithreading issues: we claim to be free of them (covered earlier this term)

Multithreading issues come up because two threads act on the same object, i.e., the data is shared. So we have to look for objects shared between threads, i.e., requests.

We need the assumption that each session (i.e. requests from one user) involves one request at a time.

--no domain objects obtained from DB, because each thread has its own EM, with its associated set of domain objects

--no fields of service objects by statelessness, and immutability of DAO references

--no session variables, by one-at-a-time request assumption

--we are not using “application” scope variables (this is possible, but leads to race conditions. We use the DB for shared data)

--we do share the emf and ThreadLocal objects, but they are thread-safe: the emf guards its internal state against race conditions, and a ThreadLocal gives each thread its own separate value.

--we do share the service singletons, but they are immutable once set up. So are the Controller singletons.

--we do share the DAO singletons, but they only have thread-safe fields.

--the request and response objects are private to the thread, created for the doGet/doPost calls. The servlet object itself is shared, but it doesn't change after init().

So we have an argument for all objects used in our programs. Right?

So we are doing multithreaded programming without a single mutex that we set up ourselves!

That’s a big win. Having multiple mutexes in code can lead to the dreaded deadlock situation, in which the software system freezes up and needs true multithreaded debugging (Java has some support for this).
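As an illustration of both the deadlock and one piece of Java's support for finding it, here is a sketch in which two threads take two locks in opposite order and the JVM's ThreadMXBean is asked to report deadlocked threads. DeadlockDemo is a made-up class; the threads are daemon threads so the program can still exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: create a two-lock deadlock on purpose, then detect it.
class DeadlockDemo {
    // Starts two daemon threads that take the same two locks in opposite order.
    static void startDeadlockedPair() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = new Thread(() -> grabInOrder(lockA, lockB, bothHoldFirst));
        Thread t2 = new Thread(() -> grabInOrder(lockB, lockA, bothHoldFirst));
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
    }

    private static void grabInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();   // wait until the other thread holds its first lock too
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {      // blocks forever: the other thread holds this lock
                System.out.println("not reached");
            }
        }
    }

    // Polls the JVM's monitor-deadlock detector; returns the number of
    // deadlocked threads found, or 0 if none show up before the timeout.
    static int detectDeadlock(long timeoutMillis) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = bean.findMonitorDeadlockedThreads();   // null when there is no deadlock
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        startDeadlockedPair();
        System.out.println("deadlocked threads found: " + detectDeadlock(5000));
    }
}
```

With a single lock, or with every thread taking locks in the same order, this cycle cannot form. With no locks of our own at all, as in our web apps, there is nothing of ours to deadlock on.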

Debugging Servlets

Don’t forget to start your tunnels before starting tomcat or doing other work.

Tomcat tries to connect to the DBs on its way up, and can fail if it can’t connect.

Bring down tomcat if your tunnels have stopped working, redo startup.

Logging using System.out.println

System.out.println output goes to the “server log”, but this is located in different places depending on circumstances of the run.

Example in servlet1: System.out.println("in doGet");

Can do with “ant test1”

If starting tomcat from the command line in Windows (startup.bat), output shows in window that showed up at tomcat start.

If starting tomcat from the command line in Linux/Mac (startup.sh), output shows in logs/catalina.out.

If tomcat started in eclipse, output shows in Console window, which may be multiplexed, esp. if you run “ant test1” in an ant window. You can switch between windows using a little pulldown menu on the Console tab.

Make sure you can find this System.out output!

We ran out of time at this point. More on this next time.

From Eclipse help, slightly edited to clarify:

Debugging a servlet running on the local system using eclipse (checked on Windows)

The debugger enables you to detect and diagnose errors in your application. It allows you to control the execution of your program by setting breakpoints, suspending threads, stepping through the code, and examining the contents of the variables. You can debug a servlet on a server without losing the state of your application.

To debug a servlet on a server:

Web attack by SQL Injection: What it is and how to avoid it

There is a possible flaw in the admin app of music3/4 that should be considered

Login UI takes in username and password

Suppose DAO does  select  count(*) from userpass where username='andrea' and password='sesame'

Sounds OK, but is prone to “SQL injection” ploy--

                Adding on to app’s SQL by putting the right text in a user input field

My break-in:

                Username   ' or 'a' = 'a

                Password    ' or 'a' = 'a

                Successful login!

Made query into

 

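  select  count(*) from userpass

     where username='' or 'a' = 'a' and password='' or 'a' = 'a'

(the user input is the text ' or 'a' = 'a, which appears twice). Because 'a' = 'a' is always true, the where clause holds for every row, so the query counts every row in the table, resulting in a successful login.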

Fixes:

or, another common approach:

In general be wary of using strings from users in SQL!
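One widely used defense is a parameterized query, where the SQL text is fixed and the user's strings are bound as values. The sketch below uses JDBC's PreparedStatement with the userpass table from the example above; it is an illustration only, not necessarily the fix used in music3/4, and LoginQueryDemo and its method names are made up. The main method only prints the unsafe query text, since running countMatches needs a live database connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical demo contrasting string-pasted SQL with a parameterized query.
class LoginQueryDemo {
    // UNSAFE: pastes user input straight into the SQL text, as in the example above.
    static String buildUnsafeQuery(String username, String password) {
        return "select count(*) from userpass where username='" + username
                + "' and password='" + password + "'";
    }

    // Safer: the SQL text is fixed and the user input is bound as parameter values,
    // so quote characters in the input are treated as data, not as SQL.
    static int countMatches(Connection conn, String username, String password) throws SQLException {
        String sql = "select count(*) from userpass where username = ? and password = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, username);
            stmt.setString(2, password);
            try (ResultSet rs = stmt.executeQuery()) {
                rs.next();
                return rs.getInt(1);
            }
        }
    }

    public static void main(String[] args) {
        String attack = "' or 'a' = 'a";
        // Shows how the attack text changes the meaning of the pasted-together query.
        System.out.println(buildUnsafeQuery(attack, attack));
    }
}
```

With the parameterized version, the attack text is simply compared against the username and password columns as an odd-looking string, and matches no row unless some user really has that username and password.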