Sharing performance testing findings

Jason B. Rappaport jasonrap at princeton.edu
Fri Apr 5 09:26:08 EDT 2019


This week we ran performance tests against our PeopleSoft system, and I
wanted to share our key findings in case they help someone else.  I am
summarizing the findings below, so please forgive me if the description is
slightly off.

 

Background:

Our PeopleSoft system is used by faculty, staff, and students and is
accessed via web proxies.  The two web proxies are load balanced (round
robin, cookie based) and configured as Shibboleth SPs, which defer
authentication to our two load-balanced (round robin, cookie based)
Shibboleth IDPs.  The IDPs then perform the actual authentication via CAS
(single node, for now), which uses AD.

 

Findings:

During our performance testing, when we tried to authenticate 1,000 users at
5 per second and have them perform a series of tasks within PeopleSoft, we
encountered a high error rate that our PeopleSoft WebLogic and app servers
could not recover from unless we restarted those services.  In digging into
where the errors were being generated, we uncovered the following:

1.	Our two Shibboleth IDPs were targeting the same AD domain controllers
(DCs), in the same order, for attribute resolution per the
attribute-resolver.xml file
2.	Our CAS server was also targeting that same first DC for
authentication

 

In essence, a single DC was seeing traffic from three hosts at the same
time, two for attribute resolution (the IDPs) and one for authentication
(CAS), which pushed its CPU usage to between 60% and 100% during our load
testing.

 

Actions taken:

1.	We changed the DC order within our Shibboleth IDPs
(attribute-resolver.xml) so that each IDP hits a different DC first for
attribute resolution
2.	We pointed CAS at a 3rd DC for authentication, i.e. one that neither
Shibboleth IDP uses for attribute resolution
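
The idea behind the two actions above can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions: the host and
DC names are hypothetical, and the sketch is not the actual
attribute-resolver.xml or CAS change.  It just shows how rotating the DC
list per IDP, plus moving CAS to a third DC, spreads the first-choice
traffic.

```python
from collections import Counter


def rotated_dc_order(dcs, index):
    """Return the DC list rotated so that client `index` starts on a different DC."""
    shift = index % len(dcs)
    return dcs[shift:] + dcs[:shift]


def first_choice_load(assignments):
    """Count how many clients hit each DC first."""
    return Counter(order[0] for order in assignments.values())


# Hypothetical names, for illustration only.
attribute_dcs = ["dc1.example.edu", "dc2.example.edu"]
idps = ["idp1", "idp2"]

# Before: both IDPs list the DCs in the same order, and CAS uses the same first DC.
before = {idp: list(attribute_dcs) for idp in idps}
before["cas"] = ["dc1.example.edu"]

# After: each IDP gets a rotated order, and CAS authenticates against a third DC.
after = {idp: rotated_dc_order(attribute_dcs, i) for i, idp in enumerate(idps)}
after["cas"] = ["dc3.example.edu"]

print("before:", dict(first_choice_load(before)))
print("after: ", dict(first_choice_load(after)))
```

In the "before" layout all three clients land on the same DC first; in the
"after" layout each DC is the first choice of exactly one client.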

 

Results:

1.	During the load test, each of the two DCs the IDPs use now operates
at around 40% CPU usage
2.	The 3rd DC, the one that CAS is using for authentication, operates
at around 15% CPU usage during the load test

 

 

We were tipped off by the post below that our IDPs could be a bottleneck if
attribute resolution and/or authentication was taking too long.  Thanks,
Scott!

http://shibboleth.1660669.n2.nabble.com/Shibboleth-idp-performance-question-td7589479.html

 

That being said, we still have more work to do: when we run a load test
with 1,000 users at 10 per second, we still encounter a high error rate.  We
are hoping to determine how to configure the system so that it handles that
load gracefully instead of erroring out.
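
For a sense of scale, here is a small worked calculation in Python.  It
assumes that "1,000 users at N per second" means new logins arrive at a
steady rate of N per second, which is an interpretation and not something
confirmed above; the function name is made up for the example.

```python
def ramp_seconds(users, logins_per_second):
    """Time needed to start all logins at a steady arrival rate."""
    return users / logins_per_second


# The two rates mentioned in the load tests.
for rate in (5, 10):
    print(f"{rate} logins/s: 1,000 users arrive over {ramp_seconds(1000, rate):.0f} s")
```

Under that assumption, doubling the rate squeezes the same 1,000
authentications into half the time window, so every component in the chain
sees roughly twice the request rate.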

 

Again, hopefully our findings can help someone else with their
configuration.  

 

Thanks, Jay 

________________________________

Jason Rappaport

Identity and Access Management Analyst

Office of Information Technology

Email:   jasonrap at princeton.edu

Office:  609-258-8464

 
