Charge Equilibration leads to Segmentation Fault

Started by neumannrf, December 23, 2020, 04:26:56 PM

neumannrf

Hello everyone! Happy holidays!

I have been using RASPA for the past few months, and recently I started incorporating the charge equilibration feature in my simulation pipeline. Initially, everything went fine, but then I started having problems.

When the ChargeFromChargeEquilibration option is set to yes, the simulation crashes with a Segmentation Fault for some (but not all) CIF files.

One example of a CIF file that leads to a crash is TER.cif from the Database of Zeolite Structures: https://asia.iza-structure.org/IZA-SC/cif/TER.cif.

The simulation.input file is as simple as:
SimulationType                  MonteCarlo
NumberOfCycles                  0
Framework                       0
FrameworkName                   TER
UnitCells                       3 2 2

ChargeFromChargeEquilibration   yes
ChargeEquilibrationPeriodic     yes
ChargeEquilibrationEwald        yes
SymmetrizeFrameworkCharges      no


The purpose of this MonteCarlo simulation is simply to create a new CIF file containing a 3x2x2 supercell with P1 symmetry.

To help with debugging, I recompiled RASPA v2.0.39 with CFLAGS="-w -ggdb -O0" and executed

gdb ~/RASPA2-2.0.39/bin/simulate

(gdb) run
Starting program: ~/RASPA2-2.0.39/bin/simulate
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6492e18 in __strcpy_sse2_unaligned () from /lib64/libc.so.6

(gdb) bt
#0  0x00007ffff6492e18 in __strcpy_sse2_unaligned () from /lib64/libc.so.6
#1  0x00007ffff718e543 in WriteFrameworkDefinitionShell (string=0x7ffff7bff6f8 "initial") at framework.c:2103
#2  0x00007ffff793de5e in run (inputData=0x408910 "simulation.input", inputCrystal=0x408930 "",
    raspaDir=0x7fffffffd8cb "~/RASPA2-2.0.39/", stream=false) at run.c:101
#3  0x00000000004013e1 in main (argc=1, argv=0x7fffffffd398) at main.c:106

(gdb) frame 1
#1  0x00007ffff718e543 in WriteFrameworkDefinitionShell (string=0x7ffff7bff6f8 "initial") at framework.c:2103
2103         strcpy(symbol,PseudoAtoms[Type].ChemicalElement);

(gdb) p Type
$2 = -415391835

(gdb) p PseudoAtoms[Type].ChemicalElement
Cannot access memory at address 0xffffffce7b817de4


The source of the Segmentation Fault seems to be that we are trying to access the PseudoAtoms array at a negative index (Type = -415391835), which does not exist. This occurs at https://github.com/iRASPA/RASPA2/blob/master/src/framework.c#L2103.
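To illustrate the kind of safeguard that would turn this crash into a readable error, here is a minimal sketch of a range-checked copy. It is not RASPA code: `PseudoAtomSketch` and `copy_element_symbol` are hypothetical names, and the struct is a simplified stand-in that only carries the element symbol.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a pseudo-atom record. */
typedef struct {
    char ChemicalElement[32];
} PseudoAtomSketch;

/* Copy the element symbol of pseudo-atom `type` into `symbol` (capacity `cap`).
 * Returns 0 on success, -1 if `type` is outside [0, n_pseudo_atoms). */
int copy_element_symbol(char *symbol, size_t cap,
                        const PseudoAtomSketch *pseudo_atoms,
                        int n_pseudo_atoms, int type)
{
    if (cap == 0) return -1;
    if (type < 0 || type >= n_pseudo_atoms) {
        /* Report the bad index instead of dereferencing it. */
        fprintf(stderr, "invalid pseudo-atom type %d (valid range 0..%d)\n",
                type, n_pseudo_atoms - 1);
        symbol[0] = '\0';
        return -1;
    }
    /* snprintf truncates instead of overflowing the destination. */
    snprintf(symbol, cap, "%s", pseudo_atoms[type].ChemicalElement);
    return 0;
}
```

A check like this would not fix whatever corrupts the value, but it would fail with the offending index rather than with a SIGSEGV inside strcpy.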

So I investigated further why/where the Type index variable was set to a negative value and found it at https://github.com/iRASPA/RASPA2/blob/master/src/framework.c#L2101.

(gdb) p CurrentSystem
$3 = 0

(gdb) p CurrentFramework
$4 = 0

(gdb) p i
$5 = 0

(gdb) p Framework[CurrentSystem].Atoms[CurrentFramework][i].Type
$6 = -415391835

(gdb) p Framework[CurrentSystem].Atoms[CurrentFramework][i]
$14 = {
    "Type": -415391835,
    "Charge": 0.10504292071040217,
    "CFVDWScalingParameter": 0.60642472553108762,
    "CFChargeScalingParameter": -0.79514090088772649,
    "CFStoredScalingParameter": 0.99604382482586074,
    "Modified": 159528,
    "OriginalType": -1078542400,
    "CreationState": -845529396,
    "AssymetricType": 1068764949,
    "temp": 0.99692550651848566,
    "Position": {
        "x": 0.42480330804058136,
        "y": 0.9052856728556895,
        "z": -0.13602553641904541
    },
    "AnisotropicPosition": {
        "x": 0.99070533128772031,
        "y": 0.93363353366682034,
        "z": -0.35822956998662936
    },
    "ReferencePosition": {
        "x": -0.73270495495271304,
        "y": -0.68054643411580806,
        "z": 0.9831752294314593
    },
    "ReferenceAnisotropicPosition": {
        "x": 0.18266490695368287,
        "y": 0.74724584781165182,
        "z": 0.6645476980083872
    },
    "RattleReferencePosition": {
        "x": 0.99913704083225319,
        "y": -0.041535209605439771,
        "z": 0.93363353366682034
    },
    "Velocity": {
        "x": -0.35822956998662936,
        "y": 0.74724584781165182,
        "z": 0.6645476980083872
    },
    "ReferenceVelocity": {
        "x": 0.99913704083225319,
        "y": -0.041535209605439771,
        "z": 0.93363353366682034
    },
    "Force": {
        "x": -0.35822956998662936,
        "y": 0.74724584781165182,
        "z": 0.6645476980083872
    },
    "ReferenceForce": {
        "x": 0.99913704083225319,
        "y": -0.041535209605439771,
        "z": 0.93363353366682034
    },
    "RattleGradient": {
        "x": -0.35822956998662936,
        "y": 0.74724584781165182,
        "z": 0.6645476980083872
    },
    "ElectricField": {
        "x": 0.99913704083225319,
        "y": -0.041535209605439771,
        "z": 0.93363353366682034
    },
    "ReferenceElectricField": {
        "x": -0.35822956998662936,
        "y": 0.75562487546411949,
        "z": 0.65500461645688624
    },
    "InducedElectricField": {
        "x": -0.50000000000000122,
        "y": -0.86602540378443804,
        "z": -0.50000000000000122
    },
    "InducedDipole": {
        "x": -0.86602540378443804,
        "y": -0.50000000000000122,
        "z": -0.86602540378443804
    },
    "HessianIndex": {
        "x": 11,
        "y": -1075838976,
        "z": -396866395
    },
    "HessianAtomIndex": -1075071366,
    "Fixed": {
        "x": -1725695833,
        "y": -1075481023,
        "z": 1040166342
    }
}


I notice that both Type and OriginalType are negative, but AssymetricType is positive.

I have read here several times that these Segmentation Faults are typically caused by something in the CIF files. I also agree that it is probably easier to fix a non-standard CIF file than to implement safeguards in the code to deal with rare situations. I hope I was able to pinpoint the issue well enough to make it easier for the experts to identify its origin. I also posted this as an issue in the RASPA GitHub repository for proper documentation.

Is there anything we can do (preferably) to the CIF file (otherwise, to the code itself) to prevent this Segmentation Fault from happening?

neumannrf

In a similar test, the Segmentation Fault crash also happened with a positive Type = 993687961, so it's not just negative indexes.
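Since the bad value can be either negative or a large positive number, a check has to test both ends of the valid range. As a rough sketch, a helper like the one below could be called after each stage (after reading the CIF file, after charge equilibration, before writing output) to narrow down where the atom data first becomes invalid. `AtomSketch` and `first_invalid_atom` are hypothetical names of my own, not RASPA identifiers, and the struct only keeps the `Type` field.

```c
#include <stddef.h>

/* Hypothetical, minimal stand-in for a framework atom: only the field of interest. */
typedef struct {
    int Type;
} AtomSketch;

/* Return the index of the first atom whose Type is outside [0, n_pseudo_atoms),
 * or -1 if every atom has a valid type. */
long first_invalid_atom(const AtomSketch *atoms, size_t n_atoms, int n_pseudo_atoms)
{
    for (size_t i = 0; i < n_atoms; i++) {
        if (atoms[i].Type < 0 || atoms[i].Type >= n_pseudo_atoms)
            return (long)i;
    }
    return -1;
}
```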

David Dubbeldam

Most likely it is related to the encoding of your cif-file (i.e. newline and carriage return, which differ between Windows and Mac/Linux).

Framework[CurrentSystem].Atoms[CurrentFramework].Type
The type seems to be nonsensical, so the error must occur during the reading of the cif-file.

To quickly debug, run with 1x1x1 and 'ChargeFromChargeEquilibration    no', and then check in the output-file that the file is read properly.

neumannrf

Happy New Year, David!

I investigated the encoding hypothesis you raised, but found no explanation so far. The (attached) TER.cif file from https://asia.iza-structure.org/IZA-SC/cif/TER.cif seems to be encoded in plain ASCII.

rneumann@linux:$>  file -i TER.cif
TER.cif: text/plain; charset=us-ascii

rneumann@macos:$>  file -I TER.cif
TER.cif: text/plain; charset=us-ascii


Also, if I open it in binary mode with vim -b TER.cif, I don't find a sneaky ^M anywhere. Everything seems pretty normal. In any case, I tried converting the encoding from ASCII to UTF-8 using the iconv -f ASCII -t UTF-8 TER.cif > TER-utf8.cif command, but got the same behaviour when simulating TER-utf8.cif. The problem does not seem to be in the part that reads the CIF file, because I even get an "End reading cif-file" success message minutes before the crash happens.
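For anyone who wants to repeat the line-ending check without relying on an editor, a small byte scan does the same job. This is only a sketch: `ByteReport` and `scan_bytes` are names I made up for illustration, and the function simply counts carriage-return bytes (0x0D) and bytes outside the 7-bit ASCII range.

```c
#include <stdio.h>

/* Byte-level summary of a text file. */
typedef struct {
    long carriage_returns; /* number of 0x0D bytes */
    long non_ascii;        /* number of bytes above 0x7F */
} ByteReport;

/* Scan `path` in binary mode and fill `report`.
 * Returns 0 on success, -1 if the file cannot be opened. */
int scan_bytes(const char *path, ByteReport *report)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) return -1;

    report->carriage_returns = 0;
    report->non_ascii = 0;

    int c;
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\r') report->carriage_returns++;
        if (c > 0x7F) report->non_ascii++;
    }
    fclose(fp);
    return 0;
}
```

For a file with Unix line endings and pure ASCII content, both counters should come back as zero.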



The other thing you suggested, though, gave rise to much more interesting results. If I run either

  • UnitCells  3 2 2 + ChargeFromChargeEquilibration   no
  • UnitCells  1 1 1 + ChargeFromChargeEquilibration   yes
the calculation runs perfectly to the end.

Could there be an effect that only manifests itself when a combination of large supercell + charge equilibration is present?