Hello everyone! Happy holidays!
I have been using RASPA for the past few months, and recently I started incorporating the charge equilibration feature in my simulation pipeline. Initially, everything went fine, but then I started having problems.
When the ChargeFromChargeEquilibration option is set to yes, the simulation crashes with a Segmentation Fault for some (but not all) CIF files.
One example of a CIF file that leads to a crash is TER.cif from the Database of Zeolite Structures: https://asia.iza-structure.org/IZA-SC/cif/TER.cif.
The simulation.input file is as simple as
SimulationType MonteCarlo
NumberOfCycles 0
Framework 0
FrameworkName TER
UnitCells 3 2 2
ChargeFromChargeEquilibration yes
ChargeEquilibrationPeriodic yes
ChargeEquilibrationEwald yes
SymmetrizeFrameworkCharges no
The purpose of this MonteCarlo simulation is simply to create a new CIF file containing a 3x2x2 supercell with P1 symmetry.
To help with debugging, I recompiled RASPA v2.0.39 with CFLAGS="-w -ggdb -O0" and executed
gdb ~/RASPA2-2.0.39/bin/simulate
(gdb) run
Starting program: ~/RASPA2-2.0.39/bin/simulate
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6492e18 in __strcpy_sse2_unaligned () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff6492e18 in __strcpy_sse2_unaligned () from /lib64/libc.so.6
#1 0x00007ffff718e543 in WriteFrameworkDefinitionShell (string=0x7ffff7bff6f8 "initial") at framework.c:2103
#2 0x00007ffff793de5e in run (inputData=0x408910 "simulation.input", inputCrystal=0x408930 "",
raspaDir=0x7fffffffd8cb "~/RASPA2-2.0.39/", stream=false) at run.c:101
#3 0x00000000004013e1 in main (argc=1, argv=0x7fffffffd398) at main.c:106
(gdb) frame 1
#1 0x00007ffff718e543 in WriteFrameworkDefinitionShell (string=0x7ffff7bff6f8 "initial") at framework.c:2103
2103 strcpy(symbol,PseudoAtoms[Type].ChemicalElement);
(gdb) p Type
$2 = -415391835
(gdb) p PseudoAtoms[Type].ChemicalElement
Cannot access memory at address 0xffffffce7b817de4
The source of the Segmentation Fault seems to be that we are trying to access the PseudoAtoms array at a negative index (Type = -415391835), which is out of bounds. This occurs at https://github.com/iRASPA/RASPA2/blob/master/src/framework.c#L2103.
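A range check before the copy would turn this kind of crash into a readable error message. The following is only a sketch of that idea, not RASPA's actual code: PseudoAtomSketch, copy_element_symbol and the n_types parameter are hypothetical names, and the struct keeps just the one field that shows up in the backtrace.

```c
#include <stdio.h>

/* Hypothetical stand-in for a pseudo-atom record (NOT RASPA's real struct);
   only the field seen in the backtrace is kept. */
typedef struct {
    char ChemicalElement[32];
} PseudoAtomSketch;

/* Copy the element symbol of entry `type` into `symbol` (capacity `cap`).
   Returns 0 on success, -1 if `type` lies outside [0, n_types). */
int copy_element_symbol(const PseudoAtomSketch *table, int n_types,
                        int type, char *symbol, size_t cap)
{
    if (type < 0 || type >= n_types) {
        fprintf(stderr,
                "invalid pseudo-atom type index %d (valid range: 0..%d)\n",
                type, n_types - 1);
        return -1;
    }
    /* snprintf truncates if needed and NUL-terminates whenever cap > 0 */
    snprintf(symbol, cap, "%s", table[type].ChemicalElement);
    return 0;
}
```

With a check like this, both the negative index above and a large positive garbage index would be rejected before any memory is touched.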
So I investigated further where the Type index variable was set to a negative value and found it at https://github.com/iRASPA/RASPA2/blob/master/src/framework.c#L2101.
(gdb) p CurrentSystem
$3 = 0
(gdb) p CurrentFramework
$4 = 0
(gdb) p i
$5 = 0
(gdb) p Framework[CurrentSystem].Atoms[CurrentFramework][i].Type
$6 = -415391835
(gdb) p Framework[CurrentSystem].Atoms[CurrentFramework][i]
$14 = {
"Type": -415391835,
"Charge": 0.10504292071040217,
"CFVDWScalingParameter": 0.60642472553108762,
"CFChargeScalingParameter": -0.79514090088772649,
"CFStoredScalingParameter": 0.99604382482586074,
"Modified": 159528,
"OriginalType": -1078542400,
"CreationState": -845529396,
"AssymetricType": 1068764949,
"temp": 0.99692550651848566,
"Position": {
"x": 0.42480330804058136,
"y": 0.9052856728556895,
"z": -0.13602553641904541
},
"AnisotropicPosition": {
"x": 0.99070533128772031,
"y": 0.93363353366682034,
"z": -0.35822956998662936
},
"ReferencePosition": {
"x": -0.73270495495271304,
"y": -0.68054643411580806,
"z": 0.9831752294314593
},
"ReferenceAnisotropicPosition": {
"x": 0.18266490695368287,
"y": 0.74724584781165182,
"z": 0.6645476980083872
},
"RattleReferencePosition": {
"x": 0.99913704083225319,
"y": -0.041535209605439771,
"z": 0.93363353366682034
},
"Velocity": {
"x": -0.35822956998662936,
"y": 0.74724584781165182,
"z": 0.6645476980083872
},
"ReferenceVelocity": {
"x": 0.99913704083225319,
"y": -0.041535209605439771,
"z": 0.93363353366682034
},
"Force": {
"x": -0.35822956998662936,
"y": 0.74724584781165182,
"z": 0.6645476980083872
},
"ReferenceForce": {
"x": 0.99913704083225319,
"y": -0.041535209605439771,
"z": 0.93363353366682034
},
"RattleGradient": {
"x": -0.35822956998662936,
"y": 0.74724584781165182,
"z": 0.6645476980083872
},
"ElectricField": {
"x": 0.99913704083225319,
"y": -0.041535209605439771,
"z": 0.93363353366682034
},
"ReferenceElectricField": {
"x": -0.35822956998662936,
"y": 0.75562487546411949,
"z": 0.65500461645688624
},
"InducedElectricField": {
"x": -0.50000000000000122,
"y": -0.86602540378443804,
"z": -0.50000000000000122
},
"InducedDipole": {
"x": -0.86602540378443804,
"y": -0.50000000000000122,
"z": -0.86602540378443804
},
"HessianIndex": {
"x": 11,
"y": -1075838976,
"z": -396866395
},
"HessianAtomIndex": -1075071366,
"Fixed": {
"x": -1725695833,
"y": -1075481023,
"z": 1040166342
}
}
I notice that both Type and OriginalType are negative, but AssymetricType is positive.
I have read here several times that it is typically something in the CIF files that generates these Segmentation Faults. I also agree that it is probably easier to fix a non-standard CIF than to implement safeguards in the code for rare situations. I hope that I was able to pinpoint the issue well enough to make it easier for the experts to identify its origin. I also posted this as an issue in the RASPA GitHub repository for proper documentation.
Is there anything we can do (preferably) to the CIF file (otherwise, to the code itself) to prevent this Segmentation Fault from happening?
In a similar test, the Segmentation Fault crash also happened with a positive Type = 993687961, so it's not just negative indexes.
Most likely it is related to the encoding of your cif-file (i.e. the newline and carriage-return characters, which differ between Windows and Mac/Linux).
Framework[CurrentSystem].Atoms[CurrentFramework].Type
The type seems to be nonsensical, so the error must occur during the reading of the cif-file.
To quickly debug, run with 1x1x1 and 'ChargeFromChargeEquilibration no', and then check in the output-file that the file is read properly.
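For reference, a minimal simulation.input for this check could look like the fragment below. It is a sketch that assumes the keywords from the original post, with only UnitCells and ChargeFromChargeEquilibration changed and the remaining charge-equilibration options left out; whether those can be omitted safely has not been verified here.

```text
SimulationType MonteCarlo
NumberOfCycles 0
Framework 0
FrameworkName TER
UnitCells 1 1 1
ChargeFromChargeEquilibration no
```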
Happy New Year, David!
I investigated the encoding hypothesis you raised, but found no explanation so far. The (attached) TER.cif file from https://asia.iza-structure.org/IZA-SC/cif/TER.cif seems to be encoded in plain ASCII.
rneumann@linux:$> file -i TER.cif
TER.cif: text/plain; charset=us-ascii
rneumann@macos:$> file -I TER.cif
TER.cif: text/plain; charset=us-ascii
Also, if I open it in binary mode with vim -b TER.cif, I don't find a sneaky ^M anywhere. Everything seems pretty normal. In any case, I tried converting the encoding from ASCII to UTF-8 using the iconv -f ASCII -t UTF-8 TER.cif > TER-utf8.cif command but got the same behaviour when simulating TER-utf8.cif. The problem does not seem to be in the part that reads the CIF file, because I even get an End reading cif-file success message minutes before the crash happens.
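The same encoding check can also be done programmatically, which may help when many CIF files have to be screened. The sketch below counts carriage-return bytes and bytes outside 7-bit ASCII; scan_bytes is a hypothetical helper written for this post and is not part of RASPA.

```c
#include <stdio.h>

/* Count carriage-return bytes and bytes outside 7-bit ASCII in a file.
   Returns 0 on success, -1 if the file cannot be opened. */
int scan_bytes(const char *path, long *n_cr, long *n_non_ascii)
{
    FILE *fp = fopen(path, "rb");   /* binary mode: no newline translation */
    int c;

    if (fp == NULL)
        return -1;

    *n_cr = 0;
    *n_non_ascii = 0;
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\r')
            ++*n_cr;                /* Windows-style line endings contain \r */
        if (c > 127)
            ++*n_non_ascii;         /* fgetc returns the byte as 0..255 */
    }
    fclose(fp);
    return 0;
}
```

If both counts come back as zero for a file, the line-ending and encoding explanation can be ruled out for that file, which would be consistent with the file and vim -b results above.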
The other thing you suggested, though, gave rise to much more interesting results. If I run either
- UnitCells 3 2 2 + ChargeFromChargeEquilibration no
- UnitCells 1 1 1 + ChargeFromChargeEquilibration yes
the calculation runs perfectly to the end.
Could there be an effect that only manifests itself when a combination of large supercell + charge equilibration is present?