LASE dies

Brian K
Advanced Member
Posts: 20
    We went live on LSF9.0.0.5 in February 2009. Since then, LASE has been dying every once in a while, on average about once every other week. Restarting the lase process gets everything working again, but we are working with Lawson to find and fix the underlying cause.

    We have applied about four PTs, none of which corrected the issue. Has anyone else experienced this, or is anyone experiencing it now? If so, did you resolve it?

    Thanks!
    Brian
    Jimmy Chiu
    Veteran Member
    Posts: 641
      Check your lase.log under LAWDIR/system. Is it filling up with "Got exception while reading from connection" errors? I have to clear the log every other week because it fills up so fast.
      Brian K
      Advanced Member
      Posts: 20
        Actually, we archive the log files every day, but the following (with the last error repeated) is all that we're getting in lase.log:

        Tue Aug 18 07:00:08 2009: Security Environment terminated with an exit status of: 0
        Tue Aug 18 07:00:08 2009: Security Environment Version 9.0.0.5.602 2009-05-27 04:00:00 (200805) Stopped.

        Tue Aug 18 07:00:09 2009: Security server 'default' failed to stop, killed.


        08/18/2009 07:00:25 getuserenv: Pid 241918
        241918: Could not get SecCtx
        241918: Error number = 111
        241918: Error message =
        241918: Error occured: Error (Result=1)
        241918: UserName = lawson

        Alex Tsekhansky
        Veteran Member
        Posts: 92
          Brian, that message is dated 7am. Was that when LASE died, or was that when you brought the system up (or down) before/after the backup?

          I would be curious to see the messages from around the time LASE died.

          Thanks.

          Alex.
          Brian K
          Advanced Member
          Posts: 20
            Hi Alex,
            Those are the messages from around the time lase died. I think it died at 7:00:09, and from then until we brought it back up, every time a user tried to access something we got this block:

            08/18/2009 07:00:25 getuserenv: Pid 241918
            241918: Could not get SecCtx
            241918: Error number = 111
            241918: Error message =
            241918: Error occured: Error (Result=1)
            241918: UserName = lawson

            With different UserNames.

            Then at 7:18 we restarted lase, and we got this in the log file:

            Tue Aug 18 07:18:39 2009: Timeout value is adjusted to 30 Secs

            Tue Aug 18 07:18:39 2009: Security Environment Version 9.0.0.5.602 2009-05-27 04:00:00 (200805) starting.

            Tue Aug 18 07:18:39 2009: Security Environment Version 9.0.0.5.602 2009-05-27 04:00:00 (200805) started.

            Security Server: Initializing default...
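
For anyone triaging a similar lase.log, here is a minimal Python sketch (not from the original posters) that pulls two things out of log text like the excerpts above: the window between a "Stopped." line and the next "started." line, and which users hit "Could not get SecCtx" in between. It assumes only the line formats quoted in this thread; the function names are made up for illustration, and other LSF versions may log differently.

```python
import re
from collections import Counter
from datetime import datetime

# Sample built from the lase.log lines quoted in this thread.
SAMPLE_LOG = """\
Tue Aug 18 07:00:08 2009: Security Environment terminated with an exit status of: 0
Tue Aug 18 07:00:08 2009: Security Environment Version 9.0.0.5.602 2009-05-27 04:00:00 (200805) Stopped.
Tue Aug 18 07:00:09 2009: Security server 'default' failed to stop, killed.
08/18/2009 07:00:25 getuserenv: Pid 241918
241918: Could not get SecCtx
241918: Error number = 111
241918: Error message =
241918: Error occured: Error (Result=1)
241918: UserName = lawson
Tue Aug 18 07:18:39 2009: Timeout value is adjusted to 30 Secs
Tue Aug 18 07:18:39 2009: Security Environment Version 9.0.0.5.602 2009-05-27 04:00:00 (200805) starting.
Tue Aug 18 07:18:39 2009: Security Environment Version 9.0.0.5.602 2009-05-27 04:00:00 (200805) started.
"""

# Lines such as "Tue Aug 18 07:00:08 2009: <message>"
STAMPED = re.compile(r"^(\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}): (.*)$")
# Lines such as "241918: Could not get SecCtx"
PID_LINE = re.compile(r"^(\d+): (.*)$")


def outage_windows(text):
    """Return (stopped_at, started_at) pairs; started_at is None if still down."""
    windows = []
    down_since = None
    for line in text.splitlines():
        match = STAMPED.match(line.strip())
        if not match:
            continue
        when = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
        message = match.group(2)
        if message.endswith("Stopped.") and down_since is None:
            down_since = when
        elif message.endswith("started.") and down_since is not None:
            windows.append((down_since, when))
            down_since = None
    if down_since is not None:
        windows.append((down_since, None))
    return windows


def failed_users(text):
    """Count 'Could not get SecCtx' blocks per UserName."""
    failed_pids = set()
    counts = Counter()
    for line in text.splitlines():
        match = PID_LINE.match(line.strip())
        if not match:
            continue
        pid, message = match.groups()
        if message == "Could not get SecCtx":
            failed_pids.add(pid)
        elif message.startswith("UserName = ") and pid in failed_pids:
            counts[message[len("UserName = "):]] += 1
            failed_pids.discard(pid)
    return counts


if __name__ == "__main__":
    for down, up in outage_windows(SAMPLE_LOG):
        print("down:", down, "up:", up)
    print(dict(failed_users(SAMPLE_LOG)))
```

Run against a few weeks of archived logs, the outage start times could then be compared with cron jobs, backups, or LDAP maintenance windows.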
            Brian K
            Advanced Member
            Posts: 20
              Also, to clarify, at 7:18/7:17, we restarted lase.
              Lonnie
              New Member
              Posts: 1
                Our LASE issues have always been related to the LDAP server going down. At least that has been our experience.
                John Henley
                Posts: 3353
                  Brian, are you on Windows, Unix, or iSeries? Which LDAP are you using: TDS or ADAM?
                  Thanks for using the LawsonGuru.com forums!
                  John
                  John Henley
                  Posts: 3353
                    If you're running on TDS, one issue is that DB2 requires some "care and feeding"; if it is not properly attended to, TDS fails, which in turn brings LASE down.

                    Thanks for using the LawsonGuru.com forums!
                    John
                    Jimmy Chiu
                    Veteran Member
                    Posts: 641
                      Is there an error entry in your LADB log file that corresponds to the LASE log entries at the time of failure?
                      Brian K
                      Advanced Member
                      Posts: 20
                        How do we "care" for TDS? We asked Lawson this too and were given only one thing to do to clean up TDS.

                        We run Unix (AIX) with TDS on DB2, and have an Oracle database.
                        Brian K
                        Advanced Member
                        Posts: 20
                          In ladb, we only get errors after the fact:

                          Tue Aug 18 07:00:25 2009: [PO30] GetDbAuthInfo() Cannot access PROD with a null RMId

                          Tue Aug 18 07:01:32 2009: [PO20] GetDbAuthInfo() Cannot access PROD with a null RMId


                          These make sense to me: LASE is dead and doesn't know what the RMId is, so it reports it as null.
                          John Henley
                          Posts: 3353
                            The question was whether there are errors in the LDAP log, not the LADB log.


                            Thanks for using the LawsonGuru.com forums!
                            John
                            Brian K
                            Advanced Member
                            Posts: 20
                              I couldn't find anything in LDAP logs around the time that LASE died.

                              I am very inexperienced with TDS, but the log files that I have checked in the past are:
                              /home/idsldap/sqllib/db2dump
                              /usr/lsfprod/idsslapd-idsldap/logs

                              Are there any others that I should be interested in?
                              Bart Conger
                              Advanced Member
                              Posts: 18
                                Validate that your SSO/LDAP resource data is good.

                                Run:
                                ssoconfig -c
                                Option 5 (manage)
                                Option 6 (export service and identity info)
                                Option 1 (export ALL)
                                Enter ALL (export ALL identities)
                                Save the file.

                                Then search the saved file for null, for example with grep -i null, piping the output to lashow or redirecting it with > to a file.

                                See if there is an actual record with null for a value. If so, you should then be able to delete the record.
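
If grep is not convenient, the same case-insensitive search can be sketched in a few lines of Python. This treats the saved export purely as text, one record fragment per line; the actual layout of the ssoconfig export is not shown in this thread, so the sample lines below are hypothetical and only there to show the call.

```python
def find_null_lines(lines):
    """Return (line_number, text) pairs for lines that mention 'null' in any case."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if "null" in line.lower():
            hits.append((number, line.rstrip("\n")))
    return hits


# Hypothetical lines, only to show the call; a real export will look different.
SAMPLE_EXPORT = [
    "identity: user1\n",
    "RMID: NULL\n",
    "service: SSOP\n",
]

if __name__ == "__main__":
    for number, text in find_null_lines(SAMPLE_EXPORT):
        print(f"{number}: {text}")
```

For a real file, pass the open file object instead of the sample list, since iterating a text file yields its lines. The line numbers make it easier to find the surrounding record in the export.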


                                Mike Gauthier
                                New Member
                                Posts: 2
                                  I can't offer Unix-specific help, but this may be something to consider.

                                  We run iSeries with TDS, and we recently had an issue where a pending constraint on an LDAP object caused LASE to fail. We had to find the pending constraint and change its status.

                                  One thing that led us to the error was the job log for IBM Directory Server (QDIRSRV). The job log showed an SQL error occurring when we were trying to launch LASE. IBM documentation indicates that when a constraint is in "check pending" status, reads are not allowed.

                                  Brian K
                                  Advanced Member
                                  Posts: 20
                                    Hi Bart,
                                    I did what you asked just to make sure, but there are no null entries (except in the SSOP entry for HTTPS) for users.

                                    The reason I am getting NULL in the error file is that lase died. If you log into portal, then run stoplase, and then try to hit a form in portal, you will get the same entries in your lase.log. At least that's how it is in our test and prod environments.

                                    Thanks for the suggestion.

                                    Brian
                                    Jwiff
                                    Basic Member
                                    Posts: 12
                                      Brian,
                                      We had something very similar at the beginning of the month. Is your LDAP on a different server than your application? (Ours is.)
                                      Bottom line: we had an intermittent problem with the DNS servers between the two. We only found it because we noticed a pattern: the failure would occur at :46 and :52 past each hour. It turned out that another application had a job that did a mass of DNS lookups and slowed things down enough at those times to cause the problem. We found it by running ping from the app server to the LDAP server: the ping would stop, the SecCtx error would occur, and an RMId-type error would show up in either the ladb or latm log, depending on what people were doing. The network team said we were doing a reverse lookup and the primary DNS server had problems (a bad fan, as it turned out). The lookup failed over to the secondary, but because it was a reverse lookup, it bounced between the two servers long enough to time out, which caused the "LDAP" error.
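
A pattern like ":46 and :52 past the hour" is easy to miss by eye. As a rough sketch (Python, not part of the original post), the failure timestamps can be bucketed by minute of the hour. The timestamp format assumed here is the one on the getuserenv lines quoted earlier in the thread (MM/DD/YYYY HH:MM:SS); the sample timestamps are invented purely to demonstrate the function.

```python
from collections import Counter
from datetime import datetime


def failures_by_minute(timestamps, fmt="%m/%d/%Y %H:%M:%S"):
    """Return (minute_of_hour, count) pairs, most frequent minute first."""
    counts = Counter(datetime.strptime(stamp, fmt).minute for stamp in timestamps)
    return counts.most_common()


# Invented timestamps for illustration only; not taken from any real log.
SAMPLE_TIMESTAMPS = [
    "08/03/2009 09:46:10",
    "08/03/2009 10:46:02",
    "08/03/2009 10:52:40",
    "08/04/2009 13:46:31",
    "08/04/2009 14:52:05",
    "08/04/2009 15:07:00",
]

if __name__ == "__main__":
    for minute, count in failures_by_minute(SAMPLE_TIMESTAMPS):
        print(f":{minute:02d} -> {count}")
```

If a handful of minutes dominate the output, that points toward a scheduled job somewhere rather than a random failure.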

                                      On a different note (not related to the SecCtx errors): when we first installed LSF9006, lase would not stop until we stopped it twice. We are not sure we found the solution, but we use LID, and when we deleted users we were not removing them from LDAP (using the Lawson Security Administrator). We also found that the order matters. Our sequence is now: delusers (from LID), then remove the user from LSA, then remove the user from UNIX (smit). There is a thread on this on the Lawson Community.

                                      Good luck.
                                      Jimmy Chiu
                                      Veteran Member
                                      Posts: 641
                                        Brian,

                                        Can you check your LDAP server log for network connection errors around the time lase went down? I am starting to think your LDAP server had a network outage or was briefly dropped from the domain. Something external may be causing this.
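
One way to look for the kind of external slowness discussed here is to time name lookups from the application server. The following is only a sketch in Python: it times a forward lookup and then a reverse lookup using the standard socket module. The hostname in the usage comment is a placeholder, and the helper names are made up for this example. How long a failing lookup takes depends on the operating system's resolver settings, which this code does not control.

```python
import socket
import time


def time_lookup(resolve, target):
    """Call resolve(target); return (result or None on failure, elapsed seconds)."""
    start = time.monotonic()
    try:
        result = resolve(target)
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        result = None
    return result, time.monotonic() - start


def check_host(hostname):
    """Time a forward lookup, then a reverse lookup on the address it returned."""
    address, forward_secs = time_lookup(socket.gethostbyname, hostname)
    report = {"forward_ok": address is not None, "forward_secs": forward_secs}
    if address is not None:
        names, reverse_secs = time_lookup(socket.gethostbyaddr, address)
        report["reverse_ok"] = names is not None
        report["reverse_secs"] = reverse_secs
    return report


# Usage (placeholder hostname, substitute your own LDAP server):
#     print(check_host("ldap.example.com"))
```

Running this on a schedule and logging the result alongside the time of day would show whether slow or failed reverse lookups line up with the LASE failures.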

                                        Brian K
                                        Advanced Member
                                        Posts: 20
                                          All of the log files and network admins have confirmed that there were no network "hiccups" at that time. We have tried to simulate other potential issues like Reverse DNS Lookup issues and such and we can't make it fail in our test servers.

                                          We have continued to have this issue, but we are hoping to upgrade to a newer version of the environment to see whether that resolves it.

                                          Thanks for all of your input.