ECO NUMBER:       ALPSYS11_071
-----------
PRODUCT:          OpenVMS Alpha Operating System
--------
UPDATED PRODUCT:  OpenVMS Alpha Operating System 7.1
----------------
APPRX BLCK SIZE:  3841
----------------


COVER LETTER

1  KIT NAME:

   ALPSYS11_071

2  KITS SUPERSEDED BY THIS KIT:

   ALPSYS10_071

3  KIT DESCRIPTION:

3.1  Version(s) of OpenVMS to which this kit may be applied:

     OpenVMS Alpha V7.1, V7.1-1H1

3.2  In order to receive the full fixes listed in this kit the
     following remedial kits (or supersedants) also need to be
     installed:

     ALPSHAD03_071 (refer to related notice in problem description
                   text)
     ALPSYS03_071 - to receive the full complement of Fast-I/O fixes.

3.3  Files patched or replaced:

     o  [SYS$LDR]SYS$VM.EXE (new image)
     o  [SYS$LDR]IO_ROUTINES.EXE (new image)
     o  [SYS$LDR]IO_ROUTINES_MON.EXE (new image)
     o  [SYS$LDR]IMAGE_MANAGEMENT.EXE (new image)
     o  [SYS$LDR]PROCESS_MANAGEMENT.EXE (new image)
     o  [SYS$LDR]PROCESS_MANAGEMENT_MON.EXE (new image)

4  PROBLEMS ADDRESSED IN ALPSYS11_071 KIT

   o  When turning on the new multithread features with either
      LINK/THREADS or via THREADCP, register corruption can occur.
      The registers R2 through R7 can be sign extended if a
      pagefault occurs and the EXEC performs a pagefault upcall to
      DECthreads.

   o  A multithreaded process with upcalls enabled may hang in HIB
      due to a missing event flag upcall. If the program makes heavy
      use of event flags for synchronization, notification of an
      event flag being set can sometimes be lost. This can result in
      a thread waiting forever for an event which already took
      place.

-- COVER LETTER --          Page 2          14 November 1997

5  PROBLEMS ADDRESSED IN ALPSYS10_071 KIT

   o  If the Job Controller's mailbox should be full at the time a
      batch job process termination message is sent to the
      JOB_CONTROL process, the message could be dropped and lost.
      This could result in SHOW QUEUE showing "executing" jobs with
      no associated process on the system.

   o  Crash (ACCVIO) in SYSCREDEL.MAR

   o  System crash due to non-paged pool leak.
      When an Oracle database is backed up using Fast-I/O,
      non-paged pool fills up with DIOBMs.

   o  The system crashes with a SSRVEXCEPT, Unexpected system
      service exception. The crash footprint is:

      SDA> CLUE CRASH
      Crash Time:
      Bugcheck Type:     SSRVEXCEPT, Unexpected system service exception
      Node:              XXXXXX (Clustered)
      CPU Type:          AlphaServer 1000A 5/400
      VMS Version:       V7.1
      Current Process:   OLS1
      Current Image:     $1$DKA4:[MSY30_5.]OLPR.EXE;1
      Failing PC:        FFFFFFFF.800CA390  EXE$CLONE_ADDRESS_SPACE_64_C+00A40
      Failing PS:        0C000000.00000003
      Module:            SYS$VM
      Offset:            00006390

      Failing Instruction:
      EXE$CLONE_ADDRESS_SPACE_64_C+00A40:     LDL     R2,#X0010(R2)

6  PROBLEMS ADDRESSED IN ALPSYS08_071 KIT

   o  Pagefault at IPL too high: PGFIPLHI bugcheck. Typically this
      will happen while running Oracle7 R7.3.2.3.2.

7  PROBLEMS ADDRESSED IN ALPSYS06_071 KIT

   o  A PT page is being processed for a process that is also an
      outswap target. The SWAPPER already marked all valid and
      modified pages "delete contents (DELCON)" and decremented the
      valid page count in the PT page's PFN record. This allowed
      PROCESS_PT_REQUEST to see a -1 in PT_VAL_CNT and to encounter
      a valid PTE while the page table page was scanned. This
      situation was believed to be an inconsistent MMG state.

   o  Audits are generated for mapping to memory resident global
      sections even though auditing is not enabled for the object.

   o  Code was not dealing with the input from the 64-bit system
      services properly in descending regions. The VA is set to the
      last byte within the page. If the page is invalid, the code
      touches the page to fault it into memory. If the following
      page is set to no access, the system crashes.

   o  This is a "day 1" bug in the Fast-I/O code. The ACB_QUOTA
      flag is burdened with double duty.
      The rest of the system uses this flag as an indicator of
      whether AST quota has been charged and must be returned upon
      AST delivery; Fast-I/O never charges AST quota but sets the
      flag anyway as an indication that an AST was requested. The
      flag must be cleared before the AST is queued. This must be
      done in three cases:

         o  fast_finish
         o  IPL4 completion
         o  SIMREQCOM

      The original code cleared the flag only for IPL4 completion.

   o  A change to clustered page deletion tried to close a hole
      that had the potential to lead to dramatic failures under very
      rare conditions. The fix was not quite right and led to a loss
      of pagefile quota under more easily achievable conditions.

   o  System-wide counters for direct and buffered I/Os may become
      inflated when Fast-I/O is used.

   o  Fast-I/O ($io_perform) transfers a maximum of 127 blocks to
      SCSI disks regardless of request size, but reports the full
      size as requested in the IOSA.

   o  A serial console on a Turbolaser (as opposed to a remote or
      LAT console) may see device timeouts after issuing commands
      such as $ directory or $ show device.

   o  Support Global Buffer Objects (achieve VAX parity; enable
      existing code and complete the port).

   o  Support buffer objects without a system space window on
      memory resident sections only. Also included are several fixes
      for buffered Fast-I/O and Fast-I/O through VIOC. Problem
      symptoms were possible process hangs and bad data returned in
      IOSA.

   o  Fix possible system hang when a memory resident section is
      created.

   o  Fix possible bugcheck when a memory resident section is
      deleted.

   o  Fix possible process hang when an image exits with
      outstanding Fast-I/O.

   o  Fix VIOC/Fast-I/O interaction. Bad data could be returned in
      IOSA, and buffers were probed unnecessarily for Fast-I/O.

   o  Fix problems with AVOID_PREEMPT. There are two separate bugs
      which can allow a non-privileged user to crash the system.
   o  There is an omission in the original submission for the
      support of 32-bit signals as generated by nonprivileged
      usermode images linked /NOSYSSHR under V6.2 or earlier. A
      check was not being made in one particular path.

   *** Notice ***

   The following problems will be corrected if both this remedial
   kit and the ALPSHAD03_071 (or supersedant) remedial kit are
   installed on the customer's system. Therefore, in order to get
   the complete list of fixes, customers should install both kits.
   However, either of these kits will run safely without the other
   kit installed.

   o  SDA> SHOW POOL can take an excessive period of time.

   o  SHOW POOL gives NOSUCHPOOL errors unnecessarily.

   o  SHOW POOL/SUMMARY counts and space totals do not match.

   o  SHOW POOL cannot always find the range.

   o  When minimum SYSTEM_PRIMITIVES is in use, SDA fails to work
      rather than signaling the correct message.

   o  The symbol file is opened by SDA even when /OVERRIDE is
      specified (and it is not used).

   o  SDA can get into a loop printing blank lines.

   o  Some of BUGCHECK's messages are confusing.

   o  The Base SVA of buffer objects is only displayed as 32 bits.

   o  An incomplete dump is inaccessible by SDA. The changes in
      this remedial kit will now treat DUMPINCOMPL as a warning if
      this is a selective dump and the dump has progressed far
      enough to dump the first process.

   o  SDA SHOW EXEC does not always display all execlets. READ/EXEC
      does not read all the symbols.

   o  MODIFY DUMP does not work on the dump header and /CONFIRM
      fails when the field being updated is a byte or a word and
      the original value is negative.

   o  BUGCHECK's two public routines (EXE$BUGCHK_REMOVE_VA,
      EXE$BUGCHK_CANCEL_REMOVE_VA) do not synchronize their
      manipulations with spinlocks.

   o  BUGCHECK fails if the only process is the swapper.

   o  Handling of Halt/Restart crashes when the Halt HWPCB is used
      is faulty.

   o  SHOW DEV MC only allows /HOME but it is documented as
      /HOMEPAGE.
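
   The ACB_QUOTA item in section 7 can be illustrated with a small
   model. The following is only a sketch written in Python, not
   the kit's code: the names Acb, Process, Completion and
   fastio_complete, and the bit value chosen for ACB_QUOTA, are
   hypothetical and exist only for this illustration. It assumes
   that a delivery path which sees the flag set returns one unit of
   AST quota, as the item describes for the rest of the system.

```python
from dataclasses import dataclass, field
from enum import Enum

ACB_QUOTA = 0x40  # hypothetical bit value, for illustration only


class Completion(Enum):
    """The three completion cases named in the problem description."""
    FAST_FINISH = "fast_finish"
    IPL4 = "IPL4 completion"
    SIMREQCOM = "SIMREQCOM"


@dataclass
class Acb:
    """Toy stand-in for an AST control block; only the flags are modeled."""
    flags: int = 0


@dataclass
class Process:
    """Toy process with an AST quota counter and a queue of pending ASTs."""
    ast_quota: int = 10
    queue: list = field(default_factory=list)

    def deliver_all(self):
        # Generic delivery: a set ACB_QUOTA bit is read as
        # "quota was charged for this AST, so return it now".
        for acb in self.queue:
            if acb.flags & ACB_QUOTA:
                self.ast_quota += 1
        self.queue.clear()


def fastio_complete(proc, acb, path, fixed=True):
    """Queue the completion AST for a modeled Fast-I/O request.

    Fast-I/O set ACB_QUOTA only to mean "an AST was requested" and
    never charged quota, so the bit must be cleared before queuing.
    With fixed=False only the IPL4 case clears it, which mirrors the
    original behavior described in the text.
    """
    if fixed or path is Completion.IPL4:
        acb.flags &= ~ACB_QUOTA
    proc.queue.append(acb)
```

   In this model, running one request through each of the three
   cases with fixed=False leaves the process with more AST quota
   than it started with, because quota that was never charged is
   handed back; with fixed=True the quota is unchanged.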

8  KIT INSTALLATION RATING:

   The following kit installation rating, based upon current CLD
   information, is provided to serve as a guide as to which
   customers should apply this remedial kit. (Reference attached
   Disclaimer of Warranty and Limitation of Liability Statement)

   INSTALLATION RATING:

   3 : To be installed by customers experiencing the problems
       corrected.

9  INSTALLATION INSTRUCTIONS:

   Install this kit with the VMSINSTAL utility by logging into the
   SYSTEM account, and typing the following at the DCL prompt:

   @SYS$UPDATE:VMSINSTAL ALPSYS11_071 [location of the saveset]

   The saveset location may be a tape drive, or a disk directory
   that contains the kit saveset.

   The system should be rebooted after successful installation of
   the kit. If you have other nodes in your VMScluster, they should
   also be rebooted in order to make use of the new image(s).


Copyright (c) Digital Equipment Corporation, 1997 All Rights
Reserved. Unpublished rights reserved under the copyright laws of
the United States.

The software contained on this media is proprietary to and embodies
the confidential technology of Digital Equipment Corporation.
Possession, use, or dissemination of the software and media is
authorized only pursuant to a valid written license from Digital
Equipment Corporation.

DISCLAIMER OF WARRANTY AND LIMITATION OF LIABILITY

THIS PATCH IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND. ALL
EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES,
INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR
PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE HEREBY EXCLUDED TO THE
EXTENT PERMITTED BY APPLICABLE LAW. IN NO EVENT WILL DIGITAL BE
LIABLE FOR ANY LOST REVENUE OR PROFIT, OR FOR SPECIAL, INDIRECT,
CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND
REGARDLESS OF THE THEORY OF LIABILITY, WITH RESPECT TO ANY PATCH
MADE AVAILABLE HERE OR TO THE USE OF SUCH PATCH.