Tuesday, 6 December 2022

Where did the disk space go?

I like a good mystery. Like this evening, when my 800GB disk partition showed only 130GB free, even though just 280GB was reported as used. Where did the rest of the disk go?!

First up, I ran WinDirStat as an administrator. No luck: it still reported only 280GB in use. What was going on here?

One cause for this apparent mismatch is System Restore points, which don't appear in disk usage reports, so to confirm I went and took a look.

In the search box, type Control Panel and open the Control Panel App:

Open All Control Panel Items > Recovery and select Configure System Restore:


Under the System Protection tab open Configure...


The maximum size for Disk Space Usage was set to the size of the disk and 400GB had already been used...


So I set the Max Usage back down to 10%, and there it was: I now had a decent amount of free space.
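Incidentally, since restore points are stored as volume shadow copies, the same information can be checked (and the limit changed) from an elevated command prompt with the built-in vssadmin tool, something along these lines:

vssadmin list shadowstorage
vssadmin resize shadowstorage /for=C: /on=C: /maxsize=10%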




Saturday, 26 November 2022

BAD_SYSTEM_CONFIG_INFO

Oh dear. Monday saw a bit of stress with the Desktop PC failing to start. It booted into a blue screen with BAD_SYSTEM_CONFIG_INFO.

After it automatically tried to repair itself with no success, I tried the steps below:

Error 0x74: Bad_system_config_info - Microsoft Support

  • System Restore: No restore point found 
  • System image recovery: Made it 50% through and said it couldn't continue
  • Safe mode: Wouldn't start in any safe mode configurations. Always returned to the same blue screen

At this point I was out of options, so as a last resort I decided to reinstall Windows from an ISO image.

I downloaded the media creation tool from Download Windows 10 (microsoft.com) and flashed it to a USB stick (at least 8GB is required for this).

Then I needed to enter the BIOS (holding Esc on startup) and set it to boot from the USB stick.

After this everything went very smoothly. My PC is connected to the network via Ethernet, so nothing was required to set up the internet connection, and the Windows license was still applied after installation.

Finally, and most fortunately, I had made a full backup of the PC only the day before, so I hadn't lost any data. The last step, restoring from that backup, was painless.

As to why it failed like this? I don't know. I ran CHKDSK afterwards and nothing untoward was reported. Perhaps a bad Windows update, although I didn't see any reports of similar issues.


Tuesday, 22 March 2022

Setting up a new Python environment

Over many years of working in software development, managing the dependencies of your code, whether in binary or source form, has always been a pain, to be honest. Take your pick from:

  • Cut and paste code
  • Linking against common libraries
  • Copying .zip files (with a suitably convoluted folder structure)
  • Installing files to a common location
    • dlls
    • .NET GAC
  • Maven
  • NuGet
  • And so on..

And now into the mix comes Python, with its 2.x (OK, not really any more) and 3.x versions, pip and conda...

A discussion of pip and conda can be found here

Anyway, having installed a new Ubuntu partition I wanted to get back the development environment I'd been using for the last couple of years. This is what I did...

Install Visual Studio Code

It's my tool of choice for Python. On Ubuntu I followed these instructions

I then installed the Python and Pylance extensions for VS Code

To check the Python environment being used, open the Command Palette with Ctrl-Shift-P and select Python: Select Interpreter.
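If in doubt about which interpreter is actually being picked up, a couple of lines of Python in the integrated terminal will confirm it; nothing here is specific to my setup:

import sys
print(sys.executable)  # path of the interpreter in use
print(sys.version)     # its Python version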

Install Anaconda

It has its quirks, but I have been using Anaconda. I installed it from scratch using the instructions here.

The default installation included all the necessary libraries for simple machine learning code. 
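A quick import check along these lines confirms the usual suspects are present (the exact versions will vary by Anaconda release):

import numpy, pandas, sklearn
print(numpy.__version__, pandas.__version__, sklearn.__version__)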

Installing Cirq

I've been looking at Quantum Computing over the last year using Cirq. I couldn't get this installed using the instructions here.

However python -m pip install Cirq worked and it appeared in my Anaconda base (root) environment in VS Code. Success!
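As a quick smoke test that the install is usable, a tiny circuit can be built and simulated. This is just a minimal sketch using the standard Cirq API, not part of my analysis code:

import cirq

# One qubit, put into superposition and measured
q = cirq.LineQubit(0)
circuit = cirq.Circuit(cirq.H(q), cirq.measure(q, key='m'))

# Run on the built-in simulator; expect a roughly even split of 0s and 1s
result = cirq.Simulator().run(circuit, repetitions=20)
print(result.histogram(key='m'))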

Installing Pygame

And finally, for some playing around with game programming, I installed Pygame with python -m pip install pygame.
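The quickest check that Pygame is working is to open an empty window; a minimal sketch (close the window to exit):

import pygame

pygame.init()
screen = pygame.display.set_mode((640, 480))
pygame.display.set_caption("Pygame smoke test")

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:  # window close button
            running = False
    screen.fill((0, 0, 0))
    pygame.display.flip()

pygame.quit()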

So my environment is back up and running on a new OS installation.

Monday, 7 March 2022

Arrgh! Segmentation fault

My Physics analysis code runs in 3 modes: for 1996 data, for 1997 data, and for both years' data. The 3rd mode also runs over different values of cuts, so as to build a set of results that can be used to estimate systematic errors in the analysis. For example, changing the value of the cut on calorimeter energy can provide an indication of how well it is modelled.

To run all these checks, a number of "steering card" files were created, each one listing the cuts to be made for a single batch run. The analysis would then run over the entire set of '96/'97 data for each batch in turn and build a list of result ntuples.

Now, while running '96 and '97 individually was successful, when running both together the process crashed on the second iteration with the following:

ANALYSE: File contains 14614 Events
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0xb5aa8e63
#1 0xb5aa7f8e
#2 0xb7fbabbf
#3 0xb6f7fa9f
#4 0x4a0c95
#5 0x4b13c7
#6 0x4ba4bf
#7 0x4ad0c4
#8 0x4ad5ba
#9 0xb57d0f20
#10 0x494ec0
Segmentation fault (core dumped)

The dreaded Fortran segmentation fault! 

In years gone by, I would have taken the approach of writing loads of write(*,*) statements around the "ANALYSE: File contains..." log and eventually homing in on the offending code, but surely I can do better now!

It turns out I can: I can add the debug flags -fcheck=all -g.

../data/mc96/mc96_1.rz
ANALYSE: File contains 14614 Events
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0xb5a8de27 in ???
#1 0xb5a8cf8e in ???
#2 0xb7f9fbbf in ???
#3 0xb6f64a9f in ???
#4 0x44634d in isr_cuts96_
at ./src/isr_cuts96.fpp:408
#5 0x457df8 in anevent96_
at ./src/anevent96.fpp:147
#6 0x46282a in do_all_
at ./src/do_all.fpp:131
#7 0x453a0a in MAIN__
at ./src/analysis.fpp:44
#8 0x453f2a in main
at ./src/analysis.fpp:66
Segmentation fault (core dumped)

This is much more useful

(-fcheck=all adds runtime checks for array bounds amongst other things; -g adds the extra information needed by the gdb debugger.)
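The flags just go on the gfortran compile line (or into the Makefile's flags variable); a hypothetical compile of one of the source files would look something like:

gfortran -fcheck=all -g -c ./src/isr_cuts96.fpp -o isr_cuts96.o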

The line in isr_cuts96.fpp is:

call hf1(510,temp,1.)

This is filling a histogram with data. The fact that it's failing on the second iteration of the batch suggests the histogram isn't being correctly cleared down after the first run.

The module I have to do the clearing down is called terminate.fpp.

A review of the HBOOK User Guide led me to the call I needed:

CALL HDELET (ID)
Action: Deletes a histogram and releases the corresponding storage space.
Input parameter description:
ID identifier of a histogram. ID = 0 deletes all existing histograms.

Adding this call to terminate.fpp fixed the problem, and I am now able to run multiple batches in the same process:

call hdelet(0)

Why this ran OK back in 2000 is probably down to a combination of the older CERNLIB library and Fortran compiler.

Monday, 31 January 2022

Updating from .NET Core 3.1 to .NET 6.0

With a couple of projects on .NET Core 3.1, I thought I would update them to .NET 6.0. This was pretty simple!

First, update the TargetFramework in the project references to net6.0:
     <TargetFramework>net6.0</TargetFramework>

Next, I took the opportunity to update NuGet references to the latest versions. I also cleared the NuGet cache with:
     dotnet nuget locals all --clear

At this point Visual Studio 2022 IntelliSense started warning that references were missing (although the code compiled properly), and the Test Runner couldn't find any of the tests. To resolve this I deleted the .vs folder and restarted.

In the SuperMarketPlanner XAML application, to fix a new compiler warning, I changed the Project SDK in the csproj from
    Project Sdk="Microsoft.NET.Sdk.WindowsDesktop"
to 
    Project Sdk="Microsoft.NET.Sdk"

Finally, in the Azure Pipeline yaml I had to update the SDK to the following:

steps:
- task: UseDotNet@2
  displayName: 'Install .NET Core 6.0.x SDK'
  inputs:
    version: 6.0.x
    performMultiLevelLookup: true

Saturday, 22 January 2022

Running ZEUS Analysis Code


Although I built CERNLIB a long time ago, apart from running PAW and looking at some old ntuples I never got round to building my old ZEUS analysis code.

Setup the Environment

CERNLIB is available as a package on Ubuntu. However, I've seen issues around running this on 64-bit Linux, so for practicality I decided to set up a 32-bit Linux VM. As noted before, I opted for Linux Mint on VirtualBox, picking the 32-bit version of Mint 19.3 Tricia.


Code Changes

So, the Fortran compiler has moved on a bit since October 2000, which was the last time I ran this code. I needed to make a few changes to compile my source successfully. Fortunately nothing too significant was required.

.EQ. to .EQV.

I was using .EQ. for logical comparison of a LOGICAL variable, for example:
        IF (Q2WTFIRST.EQ..TRUE.) THEN
However this should be .EQV. :
        IF (Q2WTFIRST.EQV..TRUE.) THEN

Using 1 and 0 for True and False

Related to the above, a compiler error was thrown where .EQ. was used with 1 or 0 to represent true and false on a LOGICAL variable:

tltcut = 0
... Change tltcut to 1 if the condition is met
if (tltcut.eq.1) then

Fix was to change the condition to:
        if (tltcut) then

DFLIB

At the time, Bristol University were trying out Visual Fortran (originally DEC, now Intel, Visual Fortran). We used the DFLIB library in a maths error handler routine. DFLIB has quite a history, but the upshot is that it's only supported on Windows. I just took out the routine that used it!

Makefile

Using Visual Fortran, the build and link steps were controlled by the IDE project. To build using gfortran on Linux I had to create a Makefile. This is an art in itself! The main issue I had to overcome was linking the CERNLIB libraries. The required libraries below (mathlib, packlib and kernlib) were available to the linker on the standard path; pdflib, however, was in a separate folder that needed including with the -L flag. Some looking around to find the correct name was required, and in the end I needed pdflib804:

-L /usr/lib/i386-linux-gnu -lmathlib -lpacklib -lkernlib -lpdflib804
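Putting that together, the final link step ends up as something like the line below; the object file list is illustrative rather than copied from my Makefile:

gfortran -o analysis *.o -L /usr/lib/i386-linux-gnu -lmathlib -lpacklib -lkernlib -lpdflib804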

.inc Files

Include files in Fortran are inserted into the source program. This is useful if you have something, such as a common block or a set of variable definitions, that you don't want to write out over and over.

c -------------------------------------------------
c Local definitions for F2/ISR analysis
c -------------------------------------------------
c Kinematical variables
c -------------------------------------------------
real empzcal, empztot, y_el, y_sig, y_elcorr, corrected_en,
& best_th, Q2_el, x_el, bestx, besty, bestz, electron_en,
& electron_th, logyel, logxel, logq2elc, logysig, logq2el,
& logmcq2, logmcx, logmcy, elumie, elumig, q2corr,
& zempzhad, zy_jb, ccy_jb, zy_jbcorr,
& logyjbz, best_th2, gamma,gamma2,z,logmcy2

This was referenced in the source files with:
        #include "common.inc"

Incidentally, the # indicated the file was processed by a C preprocessor rather than the standard Fortran include.

My .inc files were mixed in with the Fortran source files. The compiler didn't like this, so I updated the folder structure to use separate src/inc/exe folders. Funnily enough, this reflects the original folder structure from when I ran the code on Unix boxes, before we switched to Visual Fortran. The circle is complete!

How to run the Analysis code

This took a bit of remembering. The code was driven by a number of input files. 
  • Steering cards. These contain the list of cuts used to select the events; different cards define different cuts and therefore allow me to analyse systematic errors
  • File listing input Data .rz ntuples
  • File listing input Background .rz ntuples
  • File listing input MonteCarlo .rz ntuples
Of course, the naming convention wasn't particularly descriptive; for a run on 1997 data there would be:
  • stc97_n for the steering cards
  • fort.45 for the Data file list
  • fort.46 for the Background file list
  • fort.47 for the MonteCarlo file list

I should make clear at this point that the files I'm running against are not the "raw" ZEUS data files. A preliminary job was run against the full set of data on tape to select a cut-down set of events that passed some loose cuts and create ntuples with the necessary fields for this particular analysis. My ntuples were saved after this first step.

Results

I ran the code against a single 1997 data, background and MonteCarlo ntuple.

This created, amongst some other files, an israll97.hbook file. By opening PAW and entering hi/file 1 israll97.hbook I was able to load this. hi/li listed the available histograms, which looked about right, and when I opened one of them with hi/pl 105: success! I had created a histogram of E-Pz (total energy minus momentum in the z, or beampipe, direction) from my saved data files. I used this originally to help select the population of events to use in my analysis.



I've put the code up on GitHub:

The README.md describes the list of input and output files.

Next steps...
  • Run the PAW macro (kumac) files I used after this analysis step and add to GitHub.
  • Run against the full data set. I want to see how quickly this runs on current hardware. To run the full 96/97 data set against all steering cards would take about 20 hours!
  • Create a GitHub release to include sample .rz files. Hopefully to allow anyone to run this!
  • Try and build the 64bit version of CERNLIB so I don't have to run on a 32bit VM.

Wednesday, 12 January 2022

Adding Azure Pipeline status to GitHub README.md

I noticed that some GitHub projects display a build status badge in the README.md. As I have Azure Pipeline builds set up for two of my projects, CSVComparer and SuperMarketPlanner, I thought it would be a good idea to add this.

Handily, there is a REST API available for Azure Pipeline builds!

The link for the badge is of the form:

https://dev.azure.com/{Organisation}/{Project}/_apis/build/status/{pipelinename}

For CSVComparer that's: 

https://dev.azure.com/jonathanscott80/CSVComparer/_apis/build/status/jscott7.CSVComparer

Then I can add a hyperlink to the badge to navigate to the latest build page when clicked. This is of the form:

https://dev.azure.com/{Organisation}/{Project}/_build/latest?definitionId={id}

Again, for CSVComparer that's: 

https://dev.azure.com/jonathanscott80/CSVComparer/_build/latest?definitionId=2

The definitionId is the ID of the Pipeline assigned by Azure. You can find this by navigating through to the Pipeline page via https://dev.azure.com/{Organisation}/
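If you'd rather not click around, the ID also appears as the definitionId query parameter in the pipeline page's URL, and the Build Definitions REST endpoint will list the IDs too, along the lines of:

https://dev.azure.com/{Organisation}/{Project}/_apis/build/definitions?api-version=6.0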

(Useful link for the official docs here: Azure Pipelines documentation | Microsoft Docs)

Finally, to put it all together add this to the README.md file:

[![Build Status](https://dev.azure.com/jonathanscott80/CSVComparer/_apis/build/status/jscott7.CSVComparer)](https://dev.azure.com/jonathanscott80/CSVComparer/_build/latest?definitionId=2)

Here is what it looks like:


And clicking on the badge takes us here: