AWS Control Tower Enrollment Gotchas

I have been working on moving about 20-30 AWS accounts from two different AWS Organizations into a new AWS Organization with Control Tower enabled. Along the way I ran into a number of blockers whose solutions were not obvious, largely because of the cryptic errors that the AWS Control Tower enrollment process shows.

This blog post lists out some of the issues I have run into during account migrations, and what the underlying reasons and resolutions ended up being.

You can’t use the AWS IAM Identity Center provided user or account root user when enrolling accounts

If you log in to the management account and try to enroll member accounts into AWS Control Tower, you’ll get a cryptic error in the AWS Console like:

An unknown error occurred. Try again later, or contact AWS Support. No launch paths found for resource: prod-xxxxxxxxxxxx

In my case, I was initially logged in with the AWS IAM Identity Center provisioned user account for the management account. This account is identified by the default Display Name of: AWS Control Tower Admin. The solution was to create a single IAM user with the AdministratorAccess managed policy attached, then use the AWS Service Catalog console to add that user under Portfolio Access on the AWS Control Tower Account Factory Portfolio.

Once that was done, I could log in as the standard IAM user and successfully enroll member accounts into Control Tower.
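
The console steps above could also be scripted with the AWS CLI. The following is a minimal sketch under a few assumptions: the user name passed in is hypothetical, the portfolio's display name is assumed to be exactly "AWS Control Tower Account Factory Portfolio", and the commands are run with administrator credentials in the management account. The function is only defined here, not called.

```shell
# Assumed display name of the Account Factory portfolio; check yours in Service Catalog.
PORTFOLIO_NAME="AWS Control Tower Account Factory Portfolio"

# Build the ARN of an IAM user from an account ID and a user name.
iam_user_arn() {
  printf 'arn:aws:iam::%s:user/%s\n' "$1" "$2"
}

# Create an IAM user with AdministratorAccess and grant it access to the portfolio.
# $1 = management account ID, $2 = IAM user name (hypothetical, choose your own)
grant_portfolio_access() {
  account_id="$1"
  user_name="$2"

  aws iam create-user --user-name "$user_name" || return 1
  aws iam attach-user-policy \
    --user-name "$user_name" \
    --policy-arn arn:aws:iam::aws:policy/AdministratorAccess || return 1

  # Look up the portfolio ID by its display name.
  portfolio_id=$(aws servicecatalog list-portfolios \
    --query "PortfolioDetails[?DisplayName=='${PORTFOLIO_NAME}'].Id" \
    --output text) || return 1

  # Add the IAM user as a principal on the portfolio.
  aws servicecatalog associate-principal-with-portfolio \
    --portfolio-id "$portfolio_id" \
    --principal-arn "$(iam_user_arn "$account_id" "$user_name")" \
    --principal-type IAM
}

# Example (not executed here):
#   grant_portfolio_access 111111111111 ct-enrollment-user
```

The user still needs console credentials before it can log in, for example a password set in the IAM console.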

In theory it should be possible to use the AWS IAM Identity Center provided user to enroll accounts into AWS Control Tower by ensuring it is added to the relevant Groups relating to provisioning and enrollment, but even after this I was unable to. Using a standard IAM user added to Portfolio access as above worked for me.

Forgetting to create the AWSControlTowerExecution IAM role in a member account before enrolling

Don’t forget to create the AWSControlTowerExecution IAM role in the member account, with a trust policy that lets the main Control Tower management account assume it during account enrollment.

This IAM role needs to be created before you can enroll accounts into AWS Control Tower. At a high level, it should be named AWSControlTowerExecution, it should have the AWS managed policy AdministratorAccess attached, and it should have a trust policy attached like the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::Management Account ID:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}

The process and role requirements are detailed in this AWS page.
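
If you have a lot of member accounts to prepare, creating the role from the CLI may be quicker than clicking through the console. This is a minimal sketch, assuming you run it with administrator credentials for the member account; the 12-digit ID in the usage comment is a placeholder for your management account ID. The functions are only defined here, not called.

```shell
# Print the trust policy for a given management account ID.
trust_policy() {
  mgmt_account_id="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${mgmt_account_id}:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}
EOF
}

# Create the AWSControlTowerExecution role and attach AdministratorAccess.
# $1 = management account ID
create_execution_role() {
  aws iam create-role \
    --role-name AWSControlTowerExecution \
    --assume-role-policy-document "$(trust_policy "$1")" || return 1
  aws iam attach-role-policy \
    --role-name AWSControlTowerExecution \
    --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
}

# Example (not executed here):
#   create_execution_role 111111111111
```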

AWS Config in existing accounts before enrolling them with AWS Control Tower

Here’s another issue that caught me out on one account. The account had its own AWS Config Recorder and Delivery Channel set up (but it wasn’t actually being actively used).

When AWS Control Tower enrolls an account, it configures AWS Config with various best practices and configuration settings to record config changes. If there is existing AWS Config in the account, the enrollment process fails.

In my case I got the lovely error message:


I was scratching my head on this one until I decided to use the Update feature in the Control Tower console to try enrollment once more (where the account enrollment status showed Failed). The second time around, a clearer message was displayed:

AWS Control Tower could not enroll your account for the following reason: AWS Control Tower cannot create an AWS Config delivery channel because one already exists. To continue, delete the existing delivery channel and try again.

Deleting the AWS Config delivery channel was impossible though. AWS Control Tower applies guardrails via SCPs, and because the partly enrolled account had already been moved to an Organizational Unit (OU) with SCPs attached that prevent AWS Config changes, I could not remove the AWS Config settings that were blocking the enrollment process.

The fix was to ‘unmanage’ or un-enroll the AWS account (even though it wasn’t fully enrolled) using the AWS Control Tower console. Once that was done, and the account was moved back into the Organization root OU, I was able to use the AWS CLI to remove the offending AWS Config settings.

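# find any delivery channels
aws configservice describe-delivery-channels

# describe delivery channel status
aws configservice describe-delivery-channel-status

# find any configuration recorders
aws configservice describe-configuration-recorders

# stop configuration recorder named 'default' (seen in describe above)
aws configservice stop-configuration-recorder --configuration-recorder-name default

# delete the delivery channel (the name 'default' is an assumption; use the name from describe above)
aws configservice delete-delivery-channel --delivery-channel-name default

# delete configuration recorder named 'default'
aws configservice delete-configuration-recorder --configuration-recorder-name default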

With the AWS Config preferences removed, the AWS Config console should show the initial ‘Set up AWS Config’ option – i.e. nothing is configured. At this stage it’s possible to enroll the account into Control Tower successfully.
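
To avoid hitting this halfway through an enrollment, it may be worth checking each account for leftover AWS Config resources up front. The function below is a rough sketch of such a check; the name config_is_clean is my own, and it only looks at the region your CLI is currently configured for (AWS Config is regional, so repeat it for every region the account has used).

```shell
# Succeed only if the current region has no AWS Config recorder and no delivery channel.
config_is_clean() {
  recorders=$(aws configservice describe-configuration-recorders \
    --query 'length(ConfigurationRecorders)' --output text) || return 2
  channels=$(aws configservice describe-delivery-channels \
    --query 'length(DeliveryChannels)' --output text) || return 2

  echo "recorders=$recorders delivery_channels=$channels"
  [ "$recorders" -eq 0 ] && [ "$channels" -eq 0 ]
}

# Example (not executed here):
#   config_is_clean && echo "OK to enroll" || echo "Clean up AWS Config first"
```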

Feature Photo by id23: https://www.pexels.com/photo/person-on-air-traffic-control-tower-10893728/

error TS2582: Missing types in a TypeScript project

This is a quick note as a pointer to anyone running into type errors like error TS2582. You might be working with Jest, TypeScript, and a monorepo setup, using something like lerna. I was porting some projects into a monorepo and had a tsconfig.json issue which was the cause of this error. You might be seeing errors similar to: error TS2582: Cannot find name 'test'. Do you need to install type definitions for a test runner? Try npm i --save-dev @types/jest or npm i --save-dev @types/mocha.

As it turns out, my issue was that I had a tsconfig.json file with typeRoots configured to point to the package’s own node_modules directory. Like this:

{
  "compilerOptions": {
    "...": "...",
    "typeRoots": ["./node_modules/@types"]
  }
}

As this was a monorepo, common types such as those from jest were installed in the repository root, so a package tsconfig.json file under package/example-package that referenced “./node_modules/@types” was pointing at the wrong location.

The fix was simply to remove the typeRoots setting from the package, or change it to point further up to the repository root: “../../node_modules/@types”.

To quote the docs on typeRoots: if you explicitly set typeRoots, you narrow down the locations that type packages are pulled in from (compared to the default when it is not set).

By default all visible “@types” packages are included in your compilation. Packages in node_modules/@types of any enclosing folder are considered visible. For example, that means packages within ./node_modules/@types/, ../node_modules/@types/, ../../node_modules/@types/, and so on.

TypeScript TSConfig Reference
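
To see which node_modules/@types folders would be visible from a package by default, you can walk up the directory tree yourself. This is only an approximation of the lookup described in the quote above, not TypeScript's actual resolution code, and the function name is made up for illustration.

```shell
# List every node_modules/@types directory in the given folder and its enclosing folders.
# $1 = starting folder (defaults to the current directory)
visible_type_roots() {
  dir=$(cd "${1:-.}" && pwd) || return 1
  while :; do
    candidate="${dir%/}/node_modules/@types"
    if [ -d "$candidate" ]; then
      echo "$candidate"
    fi
    # Stop once the filesystem root has been checked.
    [ "$dir" = "/" ] && break
    dir=$(dirname "$dir")
  done
}

# Example (not executed here), run from the repository root:
#   visible_type_roots package/example-package
```

If the repository root's node_modules/@types shows up in the output but the package's own does not, a typeRoots entry of “./node_modules/@types” in that package will hide the root-level types.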

Fix for VM console error – unable to connect to the mks the operation is not allowed in the current state

Bit of a strange one this – I have not dug deeper to find the root cause, but here is a quick fix for anyone with the issue.

I found we could not open VM console sessions in a vCenter 5.5 environment today. When you see the classic MKS console error in a VM, the usual first thought is a DNS or port issue, but in this case I knew that neither was to blame: connecting over RDP directly to the vCenter Server itself, logging in with the C# client and opening VM consoles from there gave the exact same message. This was the case for the web client as well as the C# client.

The issue was either with the host that VMs were running on, or with the VMs themselves – the simple fix:

vMotion the VM to another host. As soon as this was done, I could open the console session. The underlying issue is still out there, but I have not had the time to dig any deeper to find out the root cause. More discussion and info available from this VMware communities thread: https://communities.vmware.com/thread/450294

Live Migrating a VM on a Hyper-V Failover Cluster fails – Processor-specific features not supported

I have been working on setting up a small cluster of Hyper-V hosts (running as VMs), nested under a bunch of physical VMware ESXi 5.0 hosts. Bear in mind I am quite new to Hyper-V; I have only ever really played with single-host Hyper-V setups in the past. Having just finished creating a Hyper-V failover cluster in this nested environment and configuring CSV (Cluster Shared Volume) storage for the Hyper-V hosts, I created a single VM to test the “live migrate” feature of Hyper-V. Upon telling the VM to live migrate from host “A” to host “B”, I got the following error message.

“There was an error checking for virtual machine compatibility on the target node”. The description reads “The virtual machine is using processor-specific features not supported on physical computer ‘DEVHYP02E’”.

So my first thought was: perhaps there is a way to mask processor features, similar to the way VMware’s EVC handles physical host CPU compatibility? If you read the rest of the error message, it does seem to indicate that there is a way of modifying the VM to limit the processor features used.

So the solution in this case is to:

  • First of all power down your VM
  • Using Hyper-V Manager, right-click the VM and select “Settings”
  • Go to the “Processor” section and tick the option on for “Migrate to a physical computer with a different processor version” under “Processor compatibility”
  • Apply settings
  • Power up the VM again

Processor compatibility settings - greyed out here as I took the screenshot after powering the VM up again.

So now if you try to live migrate to another compatible Hyper-V host, the migration should work.

Troubleshooting & Fixing VMware Host Profile errors

Synopsis

 

While applying a Host Profile created from another host in a cluster today, I got an error message, and only part of the host profile was actually applied.

A specified parameter was not correct. changedValue.key

Error message received after trying to apply profile to Host

I thought that the error message looked familiar, but couldn’t place it at the time, so I left what I was doing to look at it again later. On my way home this evening I had a bit of a brainwave: the ESX host I had taken the original profile from was at a different update level (update 2) from the newer host I was applying the profile to. I also remembered where I had seen the text “changedValue.key” before: when changing Advanced Settings on a host using PowerCLI! This gave me a good idea of where to look for the problem with this Host Profile: its Advanced Settings.

I knew it was probably to do with a value that differed between hosts because of their differing update levels, but to gather more information I decided to hit the log files. Opening up /var/log/vmware/hostd.log on the host and navigating to the time I tried to apply the Host Profile, I found this (interesting bit in the screenshot below, full log text in the section after that):

Interesting bits of information that help point to the issue in hostd.log

 

[2012-02-20 19:50:07.849 F66966D0 info 'TaskManager'] Task Created : haTask-ha-host-vim.option.OptionManager.updateValues-896
[2012-02-20 19:50:07.853 F66966D0 verbose 'VersionOptionProvider'] Attempt to set readonly option
[2012-02-20 19:50:07.853 F66966D0 info 'App'] AdapterServer caught exception: vmodl.fault.InvalidArgument
[2012-02-20 19:50:07.853 F66966D0 info 'TaskManager'] Task Completed : haTask-ha-host-vim.option.OptionManager.updateValues-896 Status error
[2012-02-20 19:50:07.853 F66966D0 info 'Vmomi'] Activation [N5Vmomi10ActivationE:0x5cf27a98] : Invoke done [updateValues] on [vim.option.OptionManager:ha-adv-options]
[2012-02-20 19:50:07.853 F66966D0 verbose 'Vmomi'] Arg changedValue:
(vim.option.OptionValue) [
   (vim.option.OptionValue) {
      dynamicType = <unset>,
      key = "Misc.HostAgentUpdateLevel",
      value = "2",
   },
   (vim.option.OptionValue) {
      dynamicType = <unset>,
      key = "Misc.HostAgentUpdateLevel",
      value = "2",
   }
]
[2012-02-20 19:50:07.853 F66966D0 info 'Vmomi'] Throw vmodl.fault.InvalidArgument
[2012-02-20 19:50:07.853 F66966D0 info 'Vmomi'] Result:
(vmodl.fault.InvalidArgument) {
   dynamicType = <unset>,
   faultCause = (vmodl.MethodFault) null,
   invalidProperty = "changedValue.key",
   msg = "",
}
[root@hostnamehere vmware]#

 

The cause:

So we can see that the Host Profile attempted a “change value” (changedValue) on the key “Misc.HostAgentUpdateLevel”, and this is where the error was thrown, with an “invalidProperty” of changedValue.key. Googling “vmodl.fault.InvalidArgument” leads to the VMware SDK Reference Guide, which states that “An InvalidArgument exception is thrown if the set of arguments passed to the function is not specified correctly.” The log line “Attempt to set readonly option” shows why: the value being changed is read-only on the host. That is as it should be, since it simply reflects the update level of the host, which is not something you would normally want to change.

The issue here was of course that the original host on which the profile was based is at update 2, whereas the new host having the profile applied is at update 4. The two settings differ, so Host Profiles tries to change this value on the new host. Because the setting is read-only, Host Profiles fails to apply the value and throws this error message at us, which also results in the rest of our host profile (annoyingly) not being applied. Ideally, if Host Profiles found a read-only value, it would skip it rather than try to change it.
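
Rather than scrolling through hostd.log by hand, the offending keys can be pulled out with a short script. This is a rough sketch that assumes verbose logging like the excerpt above, where the changedValue keys are printed shortly after the “Attempt to set readonly option” line; the 20-line window and the function name are arbitrary choices of mine.

```shell
# Print the distinct option keys logged within 20 lines after each
# "Attempt to set readonly option" message.
# $1 = log file (defaults to /var/log/vmware/hostd.log)
readonly_option_keys() {
  log_file="${1:-/var/log/vmware/hostd.log}"
  awk '
    /Attempt to set readonly option/ { window = 20 }
    window > 0 {
      window--
      # Extract the text between the quotes in: key = "Some.Option.Name"
      if (match($0, /key = "[^"]*"/)) {
        print substr($0, RSTART + 7, RLENGTH - 8)
      }
    }
  ' "$log_file" | sort -u
}

# Example (not executed here), run on the host:
#   readonly_option_keys /var/log/vmware/hostd.log
```

Any key it prints is a candidate for removal from the Host Profile's Advanced Settings.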

 

Solution:

The simple solution is to either:

  • Take a Host Profile from a host that has the settings you need and is on the same update level as the hosts you will be applying the profile to.
  • Edit this Host Profile and remove the Advanced Setting for “Misc.HostAgentUpdateLevel”.

 

In my case, I was testing the host profile on a clean ESX host before using it for other hosts. That meant I only had one new ESX host at this particular update level and therefore couldn’t use the first option (take the profile from an existing host). So I went to Home -> Host Profiles and edited this Host Profile to get rid of the unnecessary key called “Misc.HostAgentUpdateLevel”, like so:

Remove the two entries for "Misc.HostAgentUpdateLevel" from the Host Profile

After removing the entries referring to this read-only key, I simply re-applied the profile, and this time all the settings went on as expected with no error message. To sum it all up: first check that you aren’t taking a Host Profile from a reference host at a different update level from your target hosts (and if you have to, you can resort to manually editing the profile as I did). If you get cryptic errors applying your Host Profiles, check your host log files for more info and clues as to where the issue may lie.