Programming

I've heard i++ isn't thread-safe, is ++i thread-safe?

lottogame 2020. 9. 6. 11:49


I've heard that i++ isn't a thread-safe statement since in assembly it reduces down to storing the original value somewhere as a temporary, incrementing it, and then replacing it, so it could be interrupted by a context switch.

However, I'm wondering about ++i. As far as I can tell, this would reduce to a single assembly instruction, such as 'add r1, r1, 1', and since it's only one instruction, it'd be uninterruptible by a context switch.

Can anyone clarify? I'm assuming that an x86 platform is being used.


You've heard wrong. It may well be that "i++" is thread-safe for a specific compiler and specific processor architecture, but it's not mandated in the standards at all. In fact, since multi-threading isn't part of the ISO C or C++ standards (a), you can't hold anything to be thread-safe based on what you think it will compile down to.

It's quite feasible that ++i could compile to an arbitrary sequence such as:

load r0,[i]  ; load memory into reg 0
incr r0      ; increment reg 0
stor [i],r0  ; store reg 0 back to memory

which would not be thread-safe on my (imaginary) CPU that has no memory-increment instructions. Or it may be smart and compile it into:

lock         ; disable task switching (interrupts)
load r0,[i]  ; load memory into reg 0
incr r0      ; increment reg 0
stor [i],r0  ; store reg 0 back to memory
unlock       ; enable task switching (interrupts)

where lock disables and unlock enables interrupts. But even then, this may not be thread-safe on an architecture that has more than one of these CPUs sharing memory (the lock may only disable interrupts for one CPU).

The language itself (or libraries for it, if it's not built into the language) will provide thread-safe constructs, and you should use those rather than depend on your understanding (or possibly misunderstanding) of what machine code will be generated.

Things like Java's synchronized and pthread_mutex_lock() (available to C/C++ under some operating systems) are what you need to look into (a).


(a) This question was asked before the C11 and C++11 standards were completed. Those iterations have now introduced threading support into the language specifications, including atomic data types (though threads in general, and atomics in particular, are optional, at least in C).


You can't make a blanket statement about either ++i or i++. Why? Consider incrementing a 64-bit integer on a 32-bit system. Unless the underlying machine has a quad-word "load, increment, store" instruction, incrementing that value will require multiple instructions, any of which can be interrupted by a thread context switch.

In addition, ++i isn't always "add one to the value." In a language like C, incrementing a pointer actually adds the size of the thing pointed to. That is, if i is a pointer to a 32-byte structure, ++i adds 32 bytes. Whereas almost all platforms have an "increment value at memory address" instruction that is atomic, not all have an atomic "add arbitrary value to value at memory address" instruction.


They are both thread-unsafe.

A CPU cannot do math directly on memory. It does so indirectly: it loads the value from memory, does the math in a CPU register, and stores the result back.

i++

register int a1, a2;

a1 = *(&i); // one CPU instruction: LOAD from the memory location identified by i
a2 = a1;    // save the original value (this is what i++ evaluates to)
a1 += 1;    // increment in the register
*(&i) = a1; // STORE the incremented value back to memory
return a2;  // 4 CPU instructions

++i

register int a1;

a1 = *(&i); // LOAD from memory
a1 += 1;    // increment in the register
*(&i) = a1; // STORE back to memory
return a1;  // 3 CPU instructions

In both cases, there is a race condition that results in an unpredictable value of i.

For example, let's assume there are two concurrent ++i threads, each using registers a1 and b1 respectively, with context switching interleaving their execution like the following:

register int a1, b1;

a1 = *(&i);
a1 += 1;
b1 = *(&i);
b1 += 1;
*(&i) = a1;
*(&i) = b1;

As a result, i doesn't become i+2; it becomes i+1, which is incorrect.

To remedy this, modern CPUs provide some kind of LOCK/UNLOCK CPU instructions that keep context switching disabled for the interval in between.

On Win32, use InterlockedIncrement() to do a thread-safe i++. It's much faster than relying on a mutex.


If you are sharing even an int across threads in a multi-core environment, you need proper memory barriers in place. This can mean using interlocked instructions (see InterlockedIncrement in win32 for example), or using a language (or compiler) that makes certain thread-safe guarantees. With CPU level instruction-reordering and caches and other issues, unless you have those guarantees, don't assume anything shared across threads is safe.

Edit: One thing you can assume with most architectures is that if you are dealing with properly aligned single words, you won't end up with a single word containing a combination of two values that were mashed together. If two writes happen over top of each other, one will win, and the other will be discarded. If you are careful, you can take advantage of this, and see that either ++i or i++ are thread-safe in the single writer/multiple reader situation.


If you want an atomic increment in C++ you can use C++0x libraries (the std::atomic datatype) or something like TBB.

There was once a time that the GNU coding guidelines said updating datatypes that fit in one word was "usually safe" but that advice is wrong for SMP machines, wrong for some architectures, and wrong when using an optimizing compiler.


To clarify the "updating one-word datatype" comment:

It is possible for two CPUs on an SMP machine to write to the same memory location in the same cycle, and then try to propagate the change to the other CPUs and the cache. Even if only one word of data is being written so the writes only take one cycle to complete, they also happen simultaneously so you cannot guarantee which write succeeds. You won't get partially updated data, but one write will disappear because there is no other way to handle this case.

Compare-and-swap properly coordinates between multiple CPUs, but there is no reason to believe that every variable assignment of one-word datatypes will use compare-and-swap.

And while an optimizing compiler doesn't affect how a load/store is compiled, it can change when the load/store happens, causing serious trouble if you expect your reads and writes to happen in the same order they appear in the source code (the most famous being double-checked locking does not work in vanilla C++).

NOTE My original answer also said that Intel 64 bit architecture was broken in dealing with 64 bit data. That is not true, so I edited the answer, but my edit claimed PowerPC chips were broken. That is true when reading immediate values (i.e., constants) into registers (see the two sections named "Loading pointers" under listing 2 and listing 4) . But there is an instruction for loading data from memory in one cycle (lmw), so I've removed that part of my answer.


On x86/Windows in C/C++, you should not assume it is thread-safe. You should use InterlockedIncrement() and InterlockedDecrement() if you require atomic operations.


If your programming language says nothing about threads, yet runs on a multithreaded platform, how can any language construct be thread-safe?

As others pointed out: you need to protect any multithreaded access to variables by platform specific calls.

There are libraries out there that abstract away the platform specificity, and the upcoming C++ standard has adapted its memory model to cope with threads (and thus can guarantee thread-safety).


Even if it is reduced to a single assembly instruction, incrementing the value directly in memory, it is still not thread safe.

When incrementing a value in memory, the hardware does a "read-modify-write" operation: it reads the value from the memory, increments it, and writes it back to memory. The x86 hardware has no way of incrementing directly on the memory; the RAM (and the caches) is only able to read and store values, not modify them.

Now suppose you have two separate cores, either on separate sockets or sharing a single socket (with or without a shared cache). The first processor reads the value, and before it can write back the updated value, the second processor reads it. After both processors write the value back, it will have been incremented only once, not twice.

There is a way to avoid this problem; x86 processors (and most multi-core processors you will find) are able to detect this kind of conflict in hardware and sequence it, so that the whole read-modify-write sequence appears atomic. However, since this is very costly, it is only done when requested by the code, on x86 usually via the LOCK prefix. Other architectures can do this in other ways, with similar results; for instance, load-linked/store-conditional and atomic compare-and-swap (recent x86 processors also have this last one).

Note that using volatile does not help here; it only tells the compiler that the variable might be modified externally and that reads of that variable must not be cached in a register or optimized out. It does not make the compiler use atomic primitives.

The best way is to use atomic primitives (if your compiler or libraries have them), or do the increment directly in assembly (using the correct atomic instructions).


Never assume that an increment will compile down to an atomic operation. Use InterlockedIncrement or whatever similar functions exist on your target platform.

Edit: I just looked up this specific question and increment on X86 is atomic on single processor systems, but not on multiprocessor systems. Using the lock prefix can make it atomic, but it's much more portable just to use InterlockedIncrement.


The 1998 C++ standard has nothing to say about threads, although the next standard (due this year or the next) does. Therefore, you can't say anything intelligent about thread-safety of operations without referring to the implementation. It's not just the processor being used, but the combination of the compiler, the OS, and the thread model.

In the absence of documentation to the contrary, I wouldn't assume that any action is thread-safe, particularly with multi-core processors (or multi-processor systems). Nor would I trust tests, as thread synchronization problems are likely to come up only by accident.

Nothing is thread-safe unless you have documentation that says it is for the particular system you're using.


According to this assembly lesson on x86, you can atomically add a register to a memory location, so potentially your code may atomically execute '++i' or 'i++'. But as said in another post, ANSI C does not guarantee atomicity for the '++' operation, so you cannot be sure what your compiler will generate.


I think that if the expression "i++" stands alone as a statement, it's equivalent to "++i"; the compiler is smart enough not to keep a temporary value, etc. So if you can use them interchangeably (otherwise you wouldn't be asking which one to use), it doesn't matter which you use, as they're almost the same (except for aesthetics).

Anyway, even if the increment operator is atomic, that doesn't guarantee that the rest of the computation will be consistent if you don't use the correct locks.

If you want to experiment by yourself, write a program where N threads increment concurrently a shared variable M times each... if the value is less than N*M, then some increment was overwritten. Try it with both preincrement and postincrement and tell us ;-)


For a counter, I recommend a using the compare and swap idiom which is both non locking and thread-safe.

Here it is in Java:

public class IntCompareAndSwap {
    private int value = 0;

    public synchronized int get(){return value;}

    public synchronized int compareAndSwap(int p_expectedValue, int p_newValue){
        int oldValue = value;

        if (oldValue == p_expectedValue)
            value = p_newValue;

        return oldValue;
    }
}

public class IntCASCounter {

    public IntCASCounter(){
        m_value = new IntCompareAndSwap();
    }

    private IntCompareAndSwap m_value;

    public int getValue(){return m_value.get();}

    public void increment(){
        int temp;
        do {
            temp = m_value.get();
        } while (temp != m_value.compareAndSwap(temp, temp + 1));

    }

    public void decrement(){
        int temp;
        do {
            temp = m_value.get();
        } while (temp > 0 && temp != m_value.compareAndSwap(temp, temp - 1));

    }
}

Throw i into thread local storage; it isn't atomic, but it then doesn't matter.


You say "it's only one instruction, it'd be uninterruptible by a context switch." - that's all well and good for a single CPU, but what about a dual core CPU? Then you can really have two threads accessing the same variable at the same time without any context switches.

Without knowing the language, the answer is to test the heck out of it.

Source URL: https://stackoverflow.com/questions/680097/ive-heard-i-isnt-thread-safe-is-i-thread-safe
