iSeries Performance Problems

Walter
Advanced Member
Posts: 22
    We've been having this issue for quite a while.  I've opened cases with Lawson, and it has come down to them suggesting that I hire their Professional Services.  Before I do, I wanted to see if anyone else has experienced this problem.  These are the symptoms:

    a) Our users run Lawson from Portal.  They're set up with dashboards, suited to their needs, that allow them to run LBI reports.  When performance goes bad, a user's home page takes a long time to load when they sign on or do a refresh.  Here's the interesting point.  Suppose we have two users working in Portal.  User A clicks on the "globe" to refresh their home page.  The request hangs.  User B signs on and their page hangs, but this seems to "push" User A's request through and bring up User A's dashboards.  User A then clicks the "globe" again and hangs, but User B frees up.
    b) Going to an online form and performing form actions doesn't seem to be affected.  A user can access a form like HR11 and perform updates without any problems.
    c) If a user accesses a form for a batch job (for example, PR140), and does an inquire on a job that's been set up, performance is good.  However, if they make a change to one of the parameters, it becomes very slow.
    d) Trying to view reports in Print Manager is slow.
    e) Bouncing the environment resolves the issue until the next occurrence.  However, when bringing the environment down with STOPLAW, the process seems to hang.  I'm not 100% sure of this, but when I end the "JAVA" job running in the subsystem, it seems to free things up and bring down the environment.

    Every night, we bring down our environments for backup and restart them after the backup completes.  This morning, we were experiencing these symptoms right off the bat, and now it is getting worse as the day goes on.  So I'm wondering if there is something in the startup that, every once in a while, causes the problem.  (Just a thought.)

    Greg Moeller
    Veteran Member
    Posts: 1498
      Walter: There could be many reasons for this... when you say "They're set up with dashboards.." do you mean that they load LBI in their Portal home pages like we do here?

      If so, check your LBI Data Sources (WAS console, Resources, JDBC, Data Sources). Check all three (FS, RS, and SN): click on each one, then on 'Connection Pool Properties', and set Maximum connections to 200 on each. (I think the doc recommends 10, 30, and 10 respectively.)
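
The console steps above could also be scripted. This is only a rough sketch, assuming the script is run inside wsadmin with `-lang jython`, where the `AdminConfig` object is available. The data source names in the closing comment are hypothetical placeholders (the thread only refers to them as FS, RS, and SN), and the value 200 is simply the figure suggested in the post. The `AdminConfig` object is passed in as a parameter so the function can be tried with a stand-in outside wsadmin.

```python
# Sketch for wsadmin (-lang jython). Data source names are placeholders,
# not the real names used by an LBI installation.

def raise_pool_max(admin_config, ds_names, new_max):
    """Set maxConnections on the connection pool of each named data source.

    admin_config is the wsadmin AdminConfig object (or a stand-in with the
    same list/showAttribute/modify methods).
    Returns the names of the data sources that were changed.
    """
    changed = []
    # AdminConfig.list returns one configuration ID per line.
    for ds_id in admin_config.list('DataSource').splitlines():
        name = admin_config.showAttribute(ds_id, 'name')
        if name not in ds_names:
            continue
        pool_id = admin_config.showAttribute(ds_id, 'connectionPool')
        admin_config.modify(pool_id, [['maxConnections', str(new_max)]])
        changed.append(name)
    return changed

# Inside wsadmin one might then run (names are hypothetical):
#   raise_pool_max(AdminConfig, ['LBI_FS', 'LBI_RS', 'LBI_SN'], 200)
#   AdminConfig.save()
```

Nothing is persisted until `AdminConfig.save()` is called, so the change can be reviewed before it is committed.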
      Walter
      Advanced Member
      Posts: 22
        Hi Greg:  Thanks for your reply.

        Yes, the users do load LBI in their Portal home page.

        I'll double check the data sources.  I know we had already looked at the Connection Pool Properties and adjusted them.  At one time, I suspected the WAS, but we also get performance issues when trying to update a batch job's parameters in LID.  I've just bounced the environment, and now our performance is good.
        Greg Moeller
        Veteran Member
        Posts: 1498
          There is another file on the LSF environment server with a max setting that we also had to change, but I can't remember which one it was. I want to say lase-something or ls-something... sorry, I can't remember.
          mikeD
          Basic Member
          Posts: 5
            Did you ever resolve this problem? We are experiencing a similar problem. About once every 2-3 weeks, the LBI dashboard is terribly slow. If we bounce Lawson (and AS400 WebSphere), it corrects the problem. We upgraded to WAS7 and 9.0.1 back in July 2012, and I think this is when the problem started.
            Greg Moeller
            Veteran Member
            Posts: 1498
              I vaguely remember that we needed to adjust our WAS maximum allowed sessions too. Remember that when using DSSO (as I assume you are), you need to reboot not only the LBI WAS but also the Lawson core WAS when making changes to either.
              Walter
              Advanced Member
              Posts: 22
                We're still experiencing the problem.  In fact, it's happening right now as I write this. So far, we've not been able to determine the cause. The iSeries does not seem to be stressed in any way. The only remedy we have is to bounce the environment.  And if I haven't mentioned this before, when I issue the STOPLAW command, the environment doesn't seem to want to come down until I manually kill the JAVA job:

                Work with Subsystem Jobs                   CAI501  
                                                                             12/11/12  09:17:38
                 Subsystem  . . . . . . . . . . :   LAW9                                       
                                                                                               
                 Type options, press Enter.                                                    
                   2=Change   3=Hold   4=End   5=Work with   6=Release   7=Display message     
                   8=Work with spooled files   13=Disconnect                                   
                                                                                               
                                                                                               
                 Opt  Job         User        Type     -----Status-----  Function              
                      CDRIVER     LAWSON      BATCHI   ACTIVE            PGM-CDRIVER           
                      CDRIVER     LAWSON      BATCHI   ACTIVE            PGM-CDRIVER           
                      JAVA        LAWSON      BATCHI   ACTIVE            JVM-SecuritySe        
                      LADB        LAWSON      BATCHI   ACTIVE            PGM-LADB              
                      LADEATH     LAWSON      BATCHI   ACTIVE            PGM-LADEATH           
                      LAJS        LAWSON      BATCHI   ACTIVE            PGM-LAJS              
                      LAPOOLER    LAWSON      BATCHI   ACTIVE            PGM-LAPOOLER          
                      LASE        LAWSON      BATCH    ACTIVE            PGM-LASE              
                                                                                        More...

                Walter
                Advanced Member
                Posts: 22
                  This problem also shows up when a user tries to change parameters on a batch job (e.g., PR140). The inquiry function to pull up existing parameters works fine. But when the user hits CHANGE, it seems to hang until another request from some other session "pushes" it along. It's not limited to Portal. When I try to change the job in LID, the same thing happens.

                  Just to clarify:
                  I have a LID session open. I go to PR140 and do an inquire to pull up a job. Response is good. I press CHANGE and it hangs. I go to a Portal session which I have open and do a refresh on my home page. The Portal session hangs but the LID session frees up. I do another change in LID; Portal frees up but LID hangs. I can go back and forth with the problem repeating over and over again.
                  Kwane McNeal
                  Veteran Member
                  Posts: 479
                    Walter,
                    Ahh, my favorite: I affectionately call it the "Two Window Problem". This is the only Lawson issue I have never been able to nail down where it even starts within code traces.
                    I have seen this issue on both Windows clients and AIX clients, with and without LBI. Honestly, I'm stunned to hear that a System i client has this issue.

                    Here's what I know:
                    1) You need two windows sessions to confirm the issue (as you state)
                    2) The only way to resolve it is to restart the OS instance (not just Lawson)
                    3) The only symptom is that the system gradually slows down, then this issue happens
                    4) I have never been able to reproduce it on-demand

                    Here's what I have deduced:
                    1) I think one of the internal symptoms is an IPC or Messaging issue between latm and ladb
                    2) I think that it's a thread locking issue with lase answering requests from latm, which deadlocks ladb message queues

                    As I say in item #2 of the "what I know" list, you have to reboot.

                    I'd love Lawson to help track this down, but it happens so rarely, I haven't been able to reproduce it on demand.

                    Kwane
                    Kwane McNeal
                    Veteran Member
                    Posts: 479
                      Now speaking to what Greg has stated, make sure you have the following set up:

                      NOTE: This advice is ONLY appropriate for clients on 9.0.1.6+, 64-bit

                      1) in $GENDIR/java/command/lsserver.properties, you need the max heap set to 2048 (-Xmx2048m)
                      *** This is regardless of LBI usage or not.
                      2) bump your WAS threads up from the default 50, -OR- set up a WAS cluster
                      *** NOTE: While latm opens a direct connect to lsserver.port, the user and DSSO clients do not. They connect via WAS. That means you need to do this if you have LBI.
                      3) If Lawson requests changes to $LAWDIR/system/lsservice.properties for backport, and pool timeouts, request specific documentation as to the use of any setting you don't understand.
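
Item 1 above is easy to get wrong when the file is edited by hand, so a quick check may help. The following is a minimal sketch, not a Lawson tool: it scans the text of a properties file for a JVM `-Xmx` flag and reports the value in MB. The property key in the sample line is a hypothetical placeholder; the thread only states that the max heap in `$GENDIR/java/command/lsserver.properties` should be `-Xmx2048m`.

```python
import re

# Matches a JVM max-heap flag such as -Xmx2048m or -Xmx2g.
_XMX = re.compile(r'-Xmx(\d+)([kKmMgG]?)')

def max_heap_mb(properties_text):
    """Return the last -Xmx value found in the text, in MB, or None."""
    result = None
    for line in properties_text.splitlines():
        stripped = line.strip()
        if stripped.startswith('#') or stripped.startswith('!'):
            continue  # skip comment lines in a .properties file
        for amount, unit in _XMX.findall(stripped):
            size = int(amount)
            unit = unit.lower()
            if unit == 'g':
                result = size * 1024
            elif unit == 'm':
                result = size
            elif unit == 'k':
                result = size // 1024
            else:
                result = size // (1024 * 1024)  # no suffix means bytes
    return result

# Hypothetical key name; only the -Xmx2048m value comes from the post above.
sample = "some.jvm.args=-Xms256m -Xmx2048m\n"
print(max_heap_mb(sample))  # prints 2048
```

On a real system the text would come from reading the properties file itself, and a result below 2048 would indicate that the setting has not been raised.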
                      Walter
                      Advanced Member
                      Posts: 22
                        Hi Kwane: Thanks for the reply. I had a case open with Lawson on this issue earlier. After going back and forth with them, they told me I needed to engage their Professional Services group. Next month, we hope to be on a new iSeries: a POWER7, which has quite a bit more capacity than our current box.
                        This morning the problem began somewhere between 8:03 and 8:15. It's just a guess, but I think it's some process a user runs that triggers this, perhaps a large upload. During the day, there are a lot of users on the system, and I find the performance bad but tolerable. My theory is that the more users you have participating in the "Two Window Problem", the better the performance. But towards the end of the day, when there are fewer users, it becomes unbearable.

                        As I noted, when we are having the issue, working in LID is slow as well. So I've gone away from thinking it's a WAS problem. We always thought it had something to do with security. The JAVA job that I need to kill before the environment comes down is a server job running on the iSeries that deals with security (or so its name would seem to indicate). What you've deduced seems to make sense.

                        Well at least I'm not the only one that is struggling with this.
                        Brian Allen
                        Veteran Member
                        Posts: 104
                          We have had similar issues on AIX with 9.0.0 and 9.0.1. It's much better now after vertical scaling (more JVMs). lase was usually involved in the issue. We found that, as long as performance was not so degraded that the Java process was hung, restarting just lase would often clear things up.
                          Walter
                          Advanced Member
                          Posts: 22
                            Yesterday, we had another incident of this. Going on Brian's post, I restarted lase. That seemed to clear up the problem without having to bounce the environment. However, I think this broke our security. I tried accessing my employee record in HR11 and it said I was not authorized to view the employee's information. We still run laua security and my access is wide open. I'm looking into this further, but I would welcome any comments.
                            Kwane McNeal
                            Veteran Member
                            Posts: 479
                              Walter,
                              The issue is actually with the way security token info is requested by the lacobrts when a form is spun up by latm, and the global heap it's stored on.

                              When you bounce lase, you should also either bounce latm (risky), or run tmcontrol -rp PLINE PGM for every program on the right side of the tmmon screen. This will reset the heap for each program, and in the case of the former, will reset any caches at the latm level.

                              Personally, I suggest the latter. If you see odd security issues with all open programs, though, you must do the former.
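
Running `tmcontrol -rp PLINE PGM` once per program can be tedious, so here is a minimal sketch that only builds the command lines. It assumes the product line and program names have been copied by hand from the right side of the tmmon screen; it does not parse tmmon output, and the names in the example are hypothetical placeholders.

```python
def tmcontrol_reset_commands(programs):
    """Build one 'tmcontrol -rp PLINE PGM' argument list per program.

    programs: iterable of (product_line, program) pairs, copied by hand
    from the right side of the tmmon screen. Duplicates are skipped.
    """
    seen = set()
    commands = []
    for pline, pgm in programs:
        key = (pline.strip(), pgm.strip())
        if key in seen:
            continue
        seen.add(key)
        commands.append(['tmcontrol', '-rp', key[0], key[1]])
    return commands

# Hypothetical product line and program names, for illustration only.
listed = [('PROD', 'HR11'), ('PROD', 'PR140'), ('PROD', 'HR11')]
for cmd in tmcontrol_reset_commands(listed):
    print(' '.join(cmd))
```

The commands are printed rather than executed so they can be reviewed first; each argument list could be handed to `subprocess.run` on the Lawson server once the list looks right.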

                              Kwane
                              Brian Allen
                              Veteran Member
                              Posts: 104
                                We were able to do this successfully several times without restarting latm or causing any issues, but our users are all on Lawson Security.  I'm sorry that did not work in your case.  Kwane's comments may be the better alternative.