An Analysis of How CAS Works in Java
Dec 24, 2020, 05:37 PM
1. Introduction
CAS stands for compare and swap, a mechanism for implementing synchronization in a multi-threaded environment. A CAS operation takes three operands: a memory location, an expected value, and a new value. It compares the value at the memory location with the expected value; if they are equal, it replaces the value at the memory location with the new value, and if they are not equal, it does nothing. Java does not implement CAS directly in Java code. The CAS-related implementations are written as C++ inline assembly, and Java code has to call them through JNI. I will analyze the implementation details in section 3.
As described above, the CAS operation itself is not difficult. But that explanation alone is not enough, so next I will introduce some background knowledge that makes the later content easier to understand.
2. Background introduction
We all know that the CPU exchanges data with memory over the bus. In the multi-core era, multiple cores communicate with memory and other hardware over the same bus, as shown below:
[Figure not reproduced: multiple CPU cores sharing one bus to memory]
Consider the assembly instruction inc dword ptr [...], which is equivalent to DEST = DEST + 1. This instruction consists of three operations, read -> modify -> write, and involves two memory accesses. Suppose the value 1 is stored at a given location in memory, and two CPU cores execute the instruction at the same time. The two cores interleave as follows:
- Core 1 reads the value 1 from the memory location and loads it into a register
- Core 2 reads the value 1 from the memory location and loads it into a register
- Core 1 increments the value in its register by 1
- Core 2 increments the value in its register by 1
- Core 1 writes the modified value back to memory
- Core 2 writes the modified value back to memory
After these steps the value in memory is 2, not the expected 3, because one of the two increments is lost. To prevent this, two cores must not operate on the same memory area at the same time. That is the job of the lock instruction prefix, which the Intel manual describes as follows:
Causes the processor's LOCK# signal to be asserted during execution of the accompanying instruction (turns the instruction into an atomic instruction). In a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of any shared memory while the signal is asserted.
In other words, in a multiprocessor environment the LOCK# signal guarantees the processor exclusive use of the shared memory. The lock prefix can be placed before the following instructions: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG.
Adding the lock prefix before the inc instruction makes the instruction atomic. When multiple cores execute the same inc instruction at the same time, they do so serially, which avoids the situation described above. This raises another question: how does the lock prefix ensure that a core has exclusive use of a memory area? The answer is as follows:
Intel processors have two ways to give one core exclusive use of a memory area. The first is to lock the bus so that one core has the bus to itself, but this is expensive: while the bus is locked, other cores cannot access memory and may stall for a short time. The second is to lock the cache, which applies when the memory data is already held in the processor's cache. In that case the LOCK# signal does not lock the bus; it locks the memory area corresponding to the cache line, and other processors cannot operate on that memory area while it is locked. Locking the cache is clearly cheaper than locking the bus. For a more detailed description of bus locks and cache locks, see the Intel Software Developer's Manual, Volume 3, Chapter 8, Multiple-Processor Management.
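The lost update described in this section can also be observed from Java. The following is a minimal sketch, not code from the JDK: the class name LostUpdateDemo and the helper run are made up for illustration. It increments a plain int field (an unsynchronized read-modify-write) and an AtomicInteger from two threads. The plain counter may end up short of the expected total, while the atomic counter does not.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LostUpdateDemo {
    // Plain field: ++ is a non-atomic read -> modify -> write sequence.
    static int plainCounter = 0;
    // Atomic counter: each increment is performed as one atomic operation.
    static final AtomicInteger atomicCounter = new AtomicInteger();

    // Hypothetical helper: runs `threads` threads that each increment both
    // counters `perThread` times, then returns the atomic counter's total.
    static int run(int threads, int perThread) {
        plainCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    plainCounter++;                  // increments can be lost here
                    atomicCounter.incrementAndGet(); // never loses an increment
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return atomicCounter.get();
    }

    public static void main(String[] args) {
        int atomicTotal = run(2, 100_000);
        System.out.println("plain  = " + plainCounter + " (may be less than 200000)");
        System.out.println("atomic = " + atomicTotal);
    }
}
```

The plain counter's result varies from run to run, so only the atomic total is predictable.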
3. Source code analysis
With this background we can now read the CAS source code at our leisure. This section analyzes the compareAndSet method of the atomic class AtomicInteger in the java.util.concurrent.atomic package. The analysis is as follows:
public class AtomicInteger extends Number implements java.io.Serializable {

    // setup to use Unsafe.compareAndSwapInt for updates
    private static final Unsafe unsafe = Unsafe.getUnsafe();
    private static final long valueOffset;

    static {
        try {
            // Compute the offset of the field value within the object
            valueOffset = unsafe.objectFieldOffset
                (AtomicInteger.class.getDeclaredField("value"));
        } catch (Exception ex) { throw new Error(ex); }
    }

    private volatile int value;

    public final boolean compareAndSet(int expect, int update) {
        /*
         * compareAndSet is really just a shell; the main logic is wrapped in
         * the compareAndSwapInt method of Unsafe
         */
        return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
    }

    // ......
}

public final class Unsafe {
    // compareAndSwapInt is a native method; read on
    public final native boolean compareAndSwapInt(Object o, long offset,
                                                  int expected,
                                                  int x);
    // ......
}

// unsafe.cpp
/*
 * This does not look much like a function, but don't worry, that is not the point.
 * UNSAFE_ENTRY and UNSAFE_END are macros that are replaced with real code during
 * preprocessing. jboolean, jlong, jint and so on are type definitions (typedef):
 *
 * jni.h
 *     typedef unsigned char   jboolean;
 *     typedef unsigned short  jchar;
 *     typedef short           jshort;
 *     typedef float           jfloat;
 *     typedef double          jdouble;
 *
 * jni_md.h
 *     typedef int jint;
 *     #ifdef _LP64 // 64-bit
 *     typedef long jlong;
 *     #else
 *     typedef long long jlong;
 *     #endif
 *     typedef signed char jbyte;
 */
UNSAFE_ENTRY(jboolean, Unsafe_CompareAndSwapInt(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jint e, jint x))
  UnsafeWrapper("Unsafe_CompareAndSwapInt");
  oop p = JNIHandles::resolve(obj);
  // Compute the address of value from the offset. offset here is the valueOffset from AtomicInteger
  jint* addr = (jint *) index_oop_from_field_offset_long(p, offset);
  // Call the cmpxchg function in Atomic, which is declared in atomic.hpp
  return (jint)(Atomic::cmpxchg(x, addr, e)) == e;
UNSAFE_END

// atomic.cpp
unsigned Atomic::cmpxchg(unsigned int exchange_value,
                         volatile unsigned int* dest, unsigned int compare_value) {
  assert(sizeof(unsigned int) == sizeof(jint), "more work to do");
  /*
   * Call the platform-specific overload for the current operating system. Which
   * platform's overload is used is decided during preprocessing. The relevant
   * preprocessor logic is:
   *
   * atomic.inline.hpp:
   *    #include "runtime/atomic.hpp"
   *
   *    // Linux
   *    #ifdef TARGET_OS_ARCH_linux_x86
   *    # include "atomic_linux_x86.inline.hpp"
   *    #endif
   *
   *    // some code omitted
   *
   *    // Windows
   *    #ifdef TARGET_OS_ARCH_windows_x86
   *    # include "atomic_windows_x86.inline.hpp"
   *    #endif
   *
   *    // BSD
   *    #ifdef TARGET_OS_ARCH_bsd_x86
   *    # include "atomic_bsd_x86.inline.hpp"
   *    #endif
   *
   * Next we analyze the cmpxchg implementation in atomic_windows_x86.inline.hpp
   */
  return (unsigned int)Atomic::cmpxchg((jint)exchange_value, (volatile jint*)dest,
                                       (jint)compare_value);
}
The analysis above looks long, but the main flow is not complicated. If you don't get hung up on the details of the code, it is fairly easy to follow. Next, I will analyze the Atomic::cmpxchg function on the Windows platform. Read on.
// atomic_windows_x86.inline.hpp
#define LOCK_IF_MP(mp) __asm cmp mp, 0  \
                       __asm je L0      \
                       __asm _emit 0xF0 \
                       __asm L0:

inline jint Atomic::cmpxchg (jint exchange_value, volatile jint* dest, jint compare_value) {
  // alternative for InterlockedCompareExchange
  int mp = os::is_MP();
  __asm {
    mov edx, dest
    mov ecx, exchange_value
    mov eax, compare_value
    LOCK_IF_MP(mp)
    cmpxchg dword ptr [edx], ecx
  }
}
The code above consists of the LOCK_IF_MP preprocessor macro and the cmpxchg function. To make it a little clearer, let's replace LOCK_IF_MP inside cmpxchg with its actual content, as follows:
inline jint Atomic::cmpxchg (jint exchange_value, volatile jint* dest, jint compare_value) {
  // Check whether this is a multi-core CPU
  int mp = os::is_MP();
  __asm {
    // Load the argument values into registers
    mov edx, dest    // Note: dest is a pointer, so this stores the memory address in edx
    mov ecx, exchange_value
    mov eax, compare_value

    // LOCK_IF_MP
    cmp mp, 0
    /*
     * If mp == 0, the thread is running on a single-core CPU. In that case je
     * jumps to label L0, skipping the _emit 0xF0 instruction and executing
     * cmpxchg directly. In other words, no lock prefix is added before the
     * cmpxchg instruction below.
     */
    je L0
    /*
     * 0xF0 is the machine code of the lock prefix. The lock mnemonic is not
     * used here; the raw machine code is emitted instead. For the reason, see
     * this answer on Zhihu:
     *     https://www.zhihu.com/question/50878124/answer/123099923
     */
    _emit 0xF0
L0:
    /*
     * Compare and exchange. A brief explanation of the instruction below;
     * readers familiar with assembly can skip it:
     *   cmpxchg: the "compare and exchange" instruction
     *   dword: short for double word. On x86/x64,
     *          word = 2 bytes, dword = 4 bytes = 32 bits
     *   ptr: short for pointer. Used together with dword, it states that the
     *        memory unit being accessed is a double word
     *   [edx]: [...] denotes a memory unit. edx is a register holding the value
     *          of the dest pointer, so [edx] is the memory unit at address dest
     *
     * The instruction compares the value in the eax register (compare_value)
     * with the value in the double-word memory unit [edx]. If they are equal,
     * it stores the value in the ecx register (exchange_value) into [edx].
     */
    cmpxchg dword ptr [edx], ecx
  }
}
That completes the walk through the CAS implementation. CAS cannot be implemented without support from the processor. For all the code above, the core is a single cmpxchg instruction with a lock prefix, that is, lock cmpxchg dword ptr [edx], ecx.
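At the Java level, compareAndSet is usually wrapped in a retry loop: read the current value, compute a new one, and attempt the swap, repeating if another thread changed the value in between. The following is a minimal sketch of that pattern using the public AtomicInteger API. The class CasRetryLoop and the helper updateMax are hypothetical names chosen for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasRetryLoop {
    // Hypothetical helper: atomically raises `target` to at least `candidate`
    // and returns the value held by `target` afterwards.
    static int updateMax(AtomicInteger target, int candidate) {
        while (true) {
            int current = target.get();      // read the current value
            if (candidate <= current) {
                return current;              // already large enough, nothing to write
            }
            // Swap only if the value is still `current`; otherwise another
            // thread modified it after our read and we must retry.
            if (target.compareAndSet(current, candidate)) {
                return candidate;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger max = new AtomicInteger(0);
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            final int base = t * 1000;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    updateMax(max, base + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        System.out.println(max.get()); // prints 3999
    }
}
```

A failed compareAndSet changes nothing, so the loop can simply re-read and try again. No lock is held at any point.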
4. ABA problem
When talking about CAS, we basically have to talk about the ABA problem of CAS. CAS consists of three steps, namely "read->compare->writeback". Consider a situation where thread 1 and thread 2 execute CAS logic at the same time. The execution sequence of the two threads is as follows:
- Time 1: Thread 1 performs the read and obtains the original value A, and is then switched out
- Time 2: Thread 2 completes a CAS operation, changing the original value from A to B
- Time 3: Thread 2 performs another CAS operation, changing the value from B back to A
- Time 4: Thread 1 resumes, compares the expected value (compareValue) with the original value (oldValue), finds them equal, and writes the new value (newValue) to memory, completing the CAS operation
In the sequence above, thread 1 does not know that the original value was modified. From its point of view nothing has changed, so it carries on. The usual solution to the ABA problem is to attach a version number to each CAS operation. The java.util.concurrent.atomic package provides the atomic class AtomicStampedReference, which can handle the ABA problem. Its implementation is not analyzed here; interested readers can look into it themselves.
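The four moments above can be replayed deterministically in a single thread to show the difference a version number makes. This is only a sketch: the class AbaDemo and its two methods are hypothetical, and "thread 1" and "thread 2" are simulated by the order of the statements rather than by real threads. A plain CAS on the value succeeds at time 4, while the stamped CAS fails because the stamp has moved on.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaDemo {
    // Plain CAS: the stale compare at time 4 still succeeds.
    static boolean plainCasSucceeds() {
        AtomicInteger value = new AtomicInteger(1);   // A = 1
        int seenByThread1 = value.get();              // time 1: thread 1 reads A
        value.compareAndSet(1, 2);                    // time 2: thread 2 changes A -> B
        value.compareAndSet(2, 1);                    // time 3: thread 2 changes B -> A
        return value.compareAndSet(seenByThread1, 3); // time 4: thread 1's CAS
    }

    // Stamped CAS: the value is back to "A" but the stamp is not back to 0.
    static boolean stampedCasSucceeds() {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);           // time 1: reads "A" with stamp 0
        int seenStamp = stampHolder[0];
        ref.compareAndSet("A", "B", 0, 1);            // time 2: A -> B, stamp 0 -> 1
        ref.compareAndSet("B", "A", 1, 2);            // time 3: B -> A, stamp 1 -> 2
        // time 4: the reference matches, the stamp (0 vs 2) does not
        return ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);
    }

    public static void main(String[] args) {
        System.out.println(plainCasSucceeds());   // prints true
        System.out.println(stampedCasSucceeds()); // prints false
    }
}
```

AtomicStampedReference compares references with ==, which is why the sketch uses string literals (the same interned objects) rather than newly created values.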
5. Summary
With that, this article is finally coming to an end. Although the principle of CAS, including its implementation, is not difficult, writing about it really was not easy, because it involves some low-level knowledge. I can follow that material, but explaining it clearly is still a bit hard. Since my low-level knowledge is limited, some of the analysis above is bound to contain mistakes. If you find an error, please feel free to comment, ideally explaining why it is wrong. Thank you.
Okay, that’s it for this article. Thanks for reading and bye.
Appendix
The paths of the files used in the source code analysis above are listed here to help you locate them:
File name | Path |
---|---|
Unsafe.java | openjdk/jdk/src/share/classes/sun/misc/Unsafe.java |
unsafe.cpp | openjdk/hotspot/src/share/vm/prims/unsafe.cpp |
atomic.cpp | openjdk/hotspot/src/share/vm/runtime/atomic.cpp |
atomic_windows_x86.inline.hpp | openjdk/hotspot/src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp |