High volume transactions causing LSF9 problems

Dr House
New Member
Posts: 3
We are currently running LSF9, MSP5 w/patches on AIX, in a 4-server environment (web/ldap/oracle/app). We can reproduce a problem where the Lawson environment stops processing transactions properly. After pushing through a lot of transactions (potentially 50k+) in a short time frame (24 hours), our system begins to behave in an extraordinary fashion, but not in a good way. Basically, we have a script that feeds lajs batch jobs (jqsubmit -w), but the batch job submission hangs. Only when a command like `jqstatus -a` is run (with any flag you wish, including -h for help!) does the submission return to the command line. This is an indefinite hang: until another command is run against the job scheduler, the jqsubmit stays hung, which can mean 5 minutes or as long as 2-3 days. Note that we are using the -w parameter when submitting the job. Has anyone ever heard of this before?
Dr House
New Member
Posts: 3
Let me correct my previous statement: we are using wtsubmit (jqsubmit -w doesn't exist). We also use `jobload -c filename`, and both show the same behavior.
Jimmy Chiu
Veteran Member
Posts: 641
I would increase the heap size under laconfig to see if that helps. You may have reached the limit of your server hardware also. What's the spec on your app server?
Dr House
New Member
Posts: 3
I will try that out. We've tried many different things to avoid this problem: adding memory, Lawson patches (app and env), Oracle patches, WebSphere patches, and Tivoli patches, but none have reduced the frequency or avoided the problem. Our partitions run on a p595; the app server has 16 processors with 65GB of RAM, and the DB server has 8 processors and 32GB of RAM. We currently run WebSphere, but don't use the portal; our users still come in from the LID. Thanks for the help. I will try upping the heap size and try to break the server again. I'll report back what I find (positive or negative).
Jimmy Chiu
Veteran Member
Posts: 641
You also mentioned that you have a script to feed lajs batch jobs. Maybe slow down the script and put some delay in between each submit? Also, I would assume that with the number of transactions you are processing, your app server and db server are on multiple fiber connections? Reducing the network latency or delay may help the situation as well.
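As a rough illustration of the "delay between each submit" idea, here is a minimal Python sketch. Everything in it is an assumption for illustration: `submit_throttled`, its `submit` callback, and the 2-second default delay are hypothetical names and values, not anything from Lawson. The callback stands in for however your script currently invokes wtsubmit or jobload.

```python
import time
from typing import Callable, Iterable, List


def submit_throttled(jobs: Iterable[str],
                     submit: Callable[[str], int],
                     delay_seconds: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> List[int]:
    """Submit jobs one at a time, pausing between consecutive submissions.

    `submit` is a hypothetical callback that submits a single job and
    returns its exit status. `sleep` is injectable so the pacing can be
    checked without actually waiting.
    """
    results = []
    for index, job in enumerate(jobs):
        if index > 0:
            # Pause before every submission except the first one.
            sleep(delay_seconds)
        results.append(submit(job))
    return results
```

The delay value is something you would have to tune by experiment; the sketch only shows where the pause goes, and it does not by itself explain or fix the hang.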
Norm
Veteran Member
Posts: 40
Sounds like your real problem is with LAJS, since it appears you're using wtsubmit to submit batch jobs. We do something similar, but on a much smaller scale. Ours probably only submits a couple thousand jobs in a 24-hour period. We're not experiencing that particular problem, but we'll have a problem where some of our jobs simply abort after zero or one second of elapsed time. It's almost as if lajs loses track of the process and then kills it. Very weird. I'm not a big fan of LAJS; I don't think it was built to handle volumes of 'automatically' submitted jobs. But that doesn't really help you, does it? You say that running something like a jqstatus command frees up the block. Can you use cron (or a similar process) to submit a jqstatus command every minute or so to see if that helps LAJS stay unfrozen?
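If cron is awkward to set up, the same "poke the scheduler periodically" idea could be built into the submitting script. The following is a minimal Python sketch under stated assumptions: `run_with_nudge` and its parameters are hypothetical names, and the commands are passed in by the caller rather than hard-coded, since the exact wtsubmit arguments depend on your site.

```python
import subprocess
import time
from typing import Sequence


def run_with_nudge(submit_cmd: Sequence[str],
                   nudge_cmd: Sequence[str],
                   nudge_after: float = 60.0,
                   poll_interval: float = 1.0) -> int:
    """Run submit_cmd and return its exit code.

    While submit_cmd has not exited, run nudge_cmd once every
    `nudge_after` seconds (assumption: any command issued against the
    job scheduler releases the hang, as described in the first post).
    """
    proc = subprocess.Popen(submit_cmd)
    last_nudge = time.monotonic()
    while True:
        try:
            # Returns the exit code as soon as the submit command finishes.
            return proc.wait(timeout=poll_interval)
        except subprocess.TimeoutExpired:
            pass  # still running; fall through to the nudge check
        if time.monotonic() - last_nudge >= nudge_after:
            # check=False: a failing nudge should not abort the wait.
            subprocess.run(nudge_cmd, check=False)
            last_nudge = time.monotonic()
```

Here `nudge_cmd` could be `["jqstatus", "-a"]`, the command mentioned in the first post, and `submit_cmd` would be your existing wtsubmit invocation. This only works around the symptom; it would not tell you why LAJS stalls in the first place.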
John Henley
Posts: 3353
Since this is all running in batch/LID, etc., WebSphere shouldn't really play a part... it really sounds like you are experiencing a resource/memory leak. Have you looked at basic environment settings, such as ladb.cfg? It could be something really simple (trust me, I've seen crazier stuff), like someone set the INSERTBUFSIZE really high in the ORACLE file and forgot to set it back down.
Thanks for using the LawsonGuru.com forums!
John